I.    INTRODUCTION

A  general  purpose  pattern  recognition  system  is  a 
key  component  in  any  comprehensive  computer  vision 
process.  A  software  implementation  of  a  pattern 
recognition  system  is  useful  for  investigating  the 
properties  of  various  feature  extraction  and  image 
classification  techniques  on  a  wide  range  of  geometric 
objects.  No  simple  definition  of  pattern  recognition 
exists  since  it  is  a  broad  field  with  ill-defined 
boundaries.  However,  a  primary  goal  of  pattern 
recognition  can  be  simply  stated.  This  goal  is  to 
extract  specific  information  from  an  input  image  that 
can  be  used  by  the  computer  to  identify  (classify)  or 
understand  the  image.  Pattern  recognition  is  the  task 
of understanding an object from its projected image
[1].

Human  vision  is  an  automatic  process  that  is  often 
considered  independently  of  intelligence.  The  field  of 
artificial  intelligence  (AI)  attempts  to  offer  to  the 
computer what the human brain accomplishes for its
vision process. The pattern recognition aspect of
computer  vision  is  an  area  where  AI  techniques  will  be 
of  vital  importance  in  the  development  of  intelligent 
robots  and  autonomous  vehicles.  No  future  goals  of  a 
general  purpose  computer  vision  process  can  be  imagined 
without   the  application  of   some  AI   techniques. 

Despite  all  the  media  attention  that  AI  has 
received,  its  major  benefits  as  practical  software  are 
not  yet  realizable.  Currently,  AI  systems  are  highly 
specific  and  largely  inflexible.  Various  definitions 
of  AI  exist,  but  the  essence  of  AI  is  the  search  for 
useful,  approximate  solutions  to  very  hard  problems. 
The complexity and lack of structure of AI problems
imply that there are no absolute solutions to these
problems. Consequently, the primary focus of AI
research is the search for "adequate" solutions.

An  important  area  of  AI  research  is  knowledge 
representation.  To  be  useful,  knowledge  must  be 
organized  in  such  a  fashion  that  it  is  of  value  to  the 
human  user  and  the  computer  software.  Enclosing  this 
pattern  recognition  system  in  a  "flavor"  structure 
offers the advantages of object-oriented programming
and provides a useful method for representing
knowledge.  The  flavor  package  is  a  general  purpose 
software  tool  that  can  be  used  independently  of  this 
pattern   recognition    system. 

This  research  was  a  software  project.  The  software 
was  written  on  the  Electrical  and  Computer  Engineering 
Department's VAX 11/750 (Digital Equipment
Corporation), using INTERLISP, a dialect of the LISP
language developed at XEROX Corporation.
Portability to XEROX's series of AI workstations is
possible.

In  PART  II,  the  pattern  recognition  problem  is 
discussed  in  detail.  A  computer  vision  system  model  is 
presented  and  its  components  are  analyzed.  The 
important  algorithms  used  in  the  pattern  recognition 
system  are  also  examined,  especially  those  involved  in 
feature  extraction  and  feature    classification. 

In  PART  III,  the  INTERLISP  programming  environment 
and  object-oriented  programming  are  described.  The 
style  of  object-oriented  programming  chosen  for  this 
pattern recognition system is referred to as a "flavor"
system. All the flavor functions are analyzed in this
section.  The  last  section  of  this  part  contains  a 
description  of  how  the  pattern  recognition  system  is 
organized  within  the   flavor    system. 

PART  IV  contains  an  analysis  of  system  performance. 
Timing  functions  are  analyzed  to  demonstrate  how  CPU 
times  vary  with  respect  to  the  area  of  an  image  being 
analyzed.  Next,  the  invariance  properties  of  the  two 
feature  extraction  techniques  are  examined.  The  last 
section  of  this  part  contains  results  from  two  tests 
run  on  the  pattern  recognition  system.  Conclusions  are 
drawn  from  the   results  of    these   tests. 

Appendix  I  contains  output  from  a  sample  session  of 
the  pattern  recognition  system.  It  shows  the  user  how 
to  get  into  the  INTERLISP  environment  and  run  the 
software.  Appendix  II  contains  program  listings  for 
all  the  software  used  in  the  pattern  recognition 
system,    organized   by   file   name. 


II.    PATTERN    RECOGNITION 

A.    BACKGROUND 

The  advent  of  sufficiently  fast  computers  and 
digital  sensing  devices  has  greatly  expanded  the 
efforts  to  create  useful  computer  vision  systems.  The 
pattern  recognition  portion  of  the  computer  vision 
process  begins  after  an  image  has  been  input  and  stored 
in  the  computer  in  digital  form.  This  preliminary 
stage  can  be  likened  to  human  perception.  This  system 
assumes that the input images are stored in
2-dimensional arrays (matrices). All image analyses are
based  on  only  two  dimensions.  The  number  of  elements 
in  the  array  determines  the  resolution  of  the  images. 
The number of bits available to each array element
determines the number of gray levels that element can
represent. The smallest element of an image corresponds
to a single element in the array and is called a picture
element or pixel. This system stores images in 32x32 pixel
arrays; the software developer can easily change this size.
The gray scale in this system is constrained to be binary:
each pixel is either an image pixel or a background pixel,
depending on whether it is inside or outside the region
occupied by the object. The main function of this
pattern  recognition  system  is  to  provide  a  statistical 
analysis  of  the  input  image,  usually  concluding  with  a 
classification  of    the  image. 

B.    GENERAL    MODEL 

A general model for a computer vision system is
shown below.

Figure 1.  Computer vision system model.

The five principal activities of this vision system
are outlined below and are discussed more thoroughly
in Sections C through H of PART II.

1.  Data  acquisition  is  the  conversion  of  analog  to 
digital  information  and  storage  of  that  information  in 
data    arrays. 


2.  Low-level  image  processing  consists  of  noise 
reduction  and  various   image   enhancement   techniques. 

3.  Intermediate-level  image  processing  contains  the 
techniques  used  to  extract  information  from  the  image 
data  arrays,    usually   statistical    techniques. 

4.  High-level image processing seeks to find relations
between individual patterns in an array, e.g., image and
scene understanding.

5.  The  classifier  consists  of  decision-making 
algorithms  and   output. 

This  research  assumes  that  activity  1  and  parts  of  2 
have  been  previously  accomplished  by  some  combination 
of  software  and  hardware.  Activity  4  is  not  included 
in  this  pattern  recognition  system.  Activity  5 
consists  of  a  classifier  which  attempts  to  match  the 
unknown  input  images  with  a  library  of  pre-processed 
images  to  identify  what  image  class  they  belong  to. 
Another  option,  which  does  not  involve  any  library 
images,    is  also  available  and  is  called  clustering. 

The goal of each activity is to simplify the data
from the previous one, so that there are fewer total
pieces of information, but each individual piece
contains more useful or meaningful information. The
type of information required depends on the type
of problem being solved.

C.     DATA   ACQUISITION 

Data  acquisition  is  the  first  step  in  the  computer 
vision  process.  Various  types  of  digitizers  are 
available, including high-quality TV camera
digitizers. A variety of sensors, including proximity
and  tactile  ones,  are  also  being  used,  especially  for 
robotic  applications.  Faster  A/D  converters  are  being 
developed  with   the  goal   of   real-time  image  acquisition. 

Problems  with  various  vision  systems  have  often 
revolved  around  the  camera  and  the  video  signal 
standards.  These  standards  were  developed  for  human 
eyes  and  are  not  always  appropriate  for  computer 
vision.  Human  vision  tends  to  be  insensitive  to 
absolute  light  intensity,  slow  variation  in  intensity 
and  spatial  accuracy.  The  eyes  are  well  adapted  to 
detection  of  local  intensity  gradients,  while  global 
differences  in  intensity  are  perceived  only  with  high 
levels  of  contrast    [2]. 


D.    LOW-LEVEL    IMAGE    PROCESSING 

Low-level  image  processing  begins  with  techniques 
used  to  improve  the  overall  quality  of  the  image  stored 
in  the  computer.  These  involve  methods  to  decrease 
the  noise  in  the  images.  For  images  which  have  been 
degraded or blurred, there are techniques to recover
some of the original clarity using inverse or
least-squares filtering [3]. Image enhancement techniques are
available  which  can  alter  an  image  so  that  it  is  more 
suitable  than  the  original  one  for  some  specific 
application.  These  include  gray-level  transformations 
and  histogram  modifications  [4].  Image  smoothing 
techniques  reduce  errors  introduced  into  the  image  by 
poor sampling or noisy transmission channels. These
include  neighborhood  averaging  and  low-pass  filtering 
[5].  Image  sharpening  techniques  are  used  to  highlight 
the  edges  of  an  image.  Techniques  include 
differentiation  and  high-pass  filtering  [6].  To  obtain 
a  binary  image,  various  thresholding  techniques  are 
available.  The  functions  in  this  pattern  recognition 
system  can  begin  analyzing  the  image  from  a  binary 
array. 

The functions in this section of the pattern
recognition system find and trace the edges of images,
calculate the area and center of area for each image,
and count the number of holes in each.

After the arrays to be analyzed have been stored,
the function Find.Image is called. The program listing
for Find.Image is contained in Appendix II under the
file name TRACE. This function searches the input
array row by row for an image pixel that has not been
previously analyzed. This system assumes that
background pixels are represented by a period and
image pixels by an "x". Since more than one image per
array can be analyzed, images that have already been
processed must be marked. After an image has been
analyzed, a function called Replace.Image.Points (in
file CENTER) replaces all the image coordinate points
with a sequence number. This sequence number is unique
for each of the images in the array. Find.Image skips over
any coordinate points that have integer values. When
Find.Image is completed, it returns either the row and
column coordinates of an edge point for a new image or
a string containing an error message, indicating that
there are no more new images in the array.
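
The scan performed by Find.Image can be sketched in Python (the thesis
software itself is written in INTERLISP). This is a minimal illustration,
not the thesis listing: the names find_image and replace_image_points, the
list-of-lists image layout, the zero-based coordinates and the wording of
the error string are all assumptions made for the example.

```python
def find_image(grid):
    """Scan row by row for the first image pixel ("x") not yet processed.

    Background pixels are "." and processed pixels hold an integer
    sequence number, so both are skipped.  Coordinates are zero-based
    (an assumption of this sketch).
    """
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            if value == "x":
                return (row_index, col_index)
    return "No more new images in the array"


def replace_image_points(grid, points, sequence_number):
    """Overwrite every pixel of a processed image with its sequence number."""
    for row_index, col_index in points:
        grid[row_index][col_index] = sequence_number


# Hypothetical 4x4 array: image 1 is already processed, a second one remains.
grid = [
    list("...."),
    [".", 1, 1, "."],
    list("..xx"),
    list("..xx"),
]
start = find_image(grid)  # first unprocessed pixel in row-major order
```

Because the scan is row by row, the first unprocessed pixel it meets has no
unprocessed image pixel above it or to its left in the same row, so it lies
on the boundary of its image and can serve as the starting edge point.
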

The next function called is Follow.Edge (file
TRACE). Given the starting image edge point returned
by Find.Image, this function traces the edge of the
image until the starting point is reached again. For
images which are not closed, like the letter "X", the
algorithm traces over some image coordinate points more
than once, but each image point occurs only once in the
final edge list. At each image edge point, the search
algorithm has an ordered sequence of directions in
which to search for the next edge point. This list of
directions is based on the previous direction in which
an edge point was found. By adapting the list of
directions in this fashion, the algorithm can guarantee
that it will trace the largest external boundary of
the image. Two simple examples of this process are
given below.

Figure 2.  Edge tracing examples.

x' and y' represent the previous image points found, x
and y are the present image points just located, and
the numbers from 1 to 6 represent the sequence of
directions which will be searched for the next edge
point. The seventh direction will be x' or y'. The
eighth direction is not considered since it was checked
when either x' or y' was the present point (previous
step).
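
The idea of ordering the search directions by the previous move can be
sketched as follows. This is not the Follow.Edge listing: it is a standard
eight-neighbor border-following scheme that searches all eight directions
rather than the seven described above. The direction numbering, the initial
direction, the stopping rule and the name follow_edge are assumptions of the
sketch.

```python
# Direction table as (row step, column step); 0 = right, numbered
# counterclockwise on the printed array.  The numbering is an assumption:
# the thesis's own direction figure is not reproduced here.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def follow_edge(grid, start):
    """Trace the outer boundary of the image containing `start`.

    `start` must be the first image pixel met in a row-by-row scan.
    Returns (edge_points, perimeter, chain_code, repeated_flag).
    """
    def is_image(row, col):
        return (0 <= row < len(grid) and 0 <= col < len(grid[0])
                and grid[row][col] == "x")

    path, chain = [start], []
    current, direction = start, 7
    while True:
        # The search order depends on the direction of the previous move.
        if direction % 2 == 0:
            first = (direction + 7) % 8
        else:
            first = (direction + 6) % 8
        for step in range(8):
            candidate = (first + step) % 8
            row = current[0] + DIRECTIONS[candidate][0]
            col = current[1] + DIRECTIONS[candidate][1]
            if is_image(row, col):
                break
        else:
            return [start], 1, [], False  # isolated single pixel
        current, direction = (row, col), candidate
        path.append(current)
        chain.append(candidate)
        # Stop when the trace has returned to the start point and then
        # repeated its very first move.
        if len(path) > 2 and path[-1] == path[1] and path[-2] == path[0]:
            break

    path, chain = path[:-2], chain[:-1]
    edge_points, repeated = [], False
    for point in path:
        if point in edge_points:
            repeated = True  # the trace doubled back: image may not be closed
        else:
            edge_points.append(point)
    return edge_points, len(edge_points), chain, repeated
```

For a filled 3x3 square the sketch returns the eight border pixels and leaves
the repeat flag unset; for a one-pixel-wide line it doubles back over the
middle pixel and sets the flag.
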

Follow.Edge returns several important variables:
the list of edge points, the number of edge points
(perimeter), a list containing the Freeman chain code
for the edge points and a boolean flag. The Freeman
chain  code  provides  an  efficient  way  to  store  the  image 
edge  points  [7].  Each  of  the  eight  possible  directions 
from  a  coordinate  point   is  assigned  a   unique   integer. 

Figure 3.  Freeman code directions.

When an image point is found, the direction from the
previous point to the present point is placed onto this
list. Given the starting image coordinate point and
the Freeman code, an identical image can be traced
later. This code can also be used in future
enhancements to search for straight edges and corners.
The boolean flag is set to TRUE the first time an edge
point is repeated. This flag is used to warn the user
that the present image may not be closed. For example,
the flag would be TRUE for the letter "X".
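
A short sketch shows how a chain code stores an edge and how the edge can be
regenerated from the starting point. The integer assigned to each direction
below (0 = right, increasing counterclockwise on the printed array) is an
assumption, since the direction figure is not reproduced here; chain_encode
and chain_decode are hypothetical names.

```python
# (row step, column step) for codes 0..7; the numbering is assumed.
STEPS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
         (0, -1), (1, -1), (1, 0), (1, 1)]


def chain_encode(points):
    """Return the direction code between each pair of consecutive points."""
    codes = []
    for (row_a, col_a), (row_b, col_b) in zip(points, points[1:]):
        codes.append(STEPS.index((row_b - row_a, col_b - col_a)))
    return codes


def chain_decode(start, codes):
    """Rebuild the list of points from a starting point and a chain code."""
    points = [start]
    for code in codes:
        row, col = points[-1]
        d_row, d_col = STEPS[code]
        points.append((row + d_row, col + d_col))
    return points


# Border of a 3x3 square, traced from its top-left pixel back to itself.
boundary = [(1, 1), (2, 1), (3, 1), (3, 2), (3, 3),
            (2, 3), (1, 3), (1, 2), (1, 1)]
codes = chain_encode(boundary)
```

Each edge point costs one small integer instead of a coordinate pair, which
is the storage saving referred to above.
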

The next function called is Shell.Sort.Edge (file
CENTER). This function sorts the edge coordinate
points into numerical order, by row then by column, e.g.,
(1 1) (1 2) (2 1) (2 2), where each coordinate pair is
written row then column. The sorted list is
returned in a one-dimensional array. This sorting is
done so that a later function, Find.Image.Points, can
more  efficiently  search  for  all  the  coordinate  points 
in  the  interior  of  the  image.  Sorting  the  edge  points 
allows  Find.Image.Points  to  step  through  the  edge  array 
and  know  when  to  search  between  two  edge  points  for 
interior points.

The next function called is Find.Image.Points. The
purpose of this function is to search in an orderly
fashion for all the coordinate points in the interior
of an image. The algorithm takes two adjacent edge
points from the sorted edge array (these points are
not necessarily physically adjacent) and checks to see
if they are on the same row. If they are not, then
there cannot be any interior points between
them. If they are on the same row and not physically
adjacent,  then  there  is  the  possibility  that  there  are 
interior  points  between  them.  A  sequential  search  of 
all  the  points  between  the  two  edge  points  is  made  and 
all  interior  image  points  are  identified.  This  row 
search  is  done  column  by  column,  from  the  smallest  to 
the  largest  column.  In  addition  to  the  list  of 
interior  points,  lists  containing  the  row  and  column 
numbers  for  each  image  point  are  kept.  These  lists 
will  be  used  later  to  calculate  the  center  of  area  for 
the  image.  The  row  list  will  contain  a  series  of 
sublists,  each  containing  the  row  number  and  the  number 
of  image  points  in  that  row.  The  column  list  will  be 
less  formatted  because  the  search  is  organized  on  a 
row-by-row  basis.  The  next  row  searched  will  always  be 
one more numerically than the previous row, but column
numbers can vary, depending on the overall shape of the
image.

While searching between two non-adjacent edge points
in the same row, the algorithm also keeps track of
potential holes in the interior of the image. The
first  time  a  background  pixel  is  encountered  in  the  row 
search,  the  coordinate  of  the  last  image  pixel  before 
the  hole  coordinate  is  placed  onto  a  list  containing 
potential  hole  edge  points.  No  other  image  pixels  will 
be  entered  into  the  hole  list  for  that  row  unless 
another  image  pixel  is  found  between  the  two  background 
pixels. Some or all of these potential hole edge points may
turn out not to be actual hole edge points. Later, the
function  Find.No.Of.Holes  will  be  called  to  calculate 
the  number  of  actual  holes  in  the  image.  At  the 
conclusion of Find.Image.Points, the lists containing
all  the  interior  points,  row  and  column  quantities  and 
potential  hole  edge  points  are  returned  to  the  calling 
routine. 
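
The row search can be sketched as below. This is a simplified reading of the
description above, not the Find.Image.Points listing: it returns only the
interior points and the potential hole edge points, it uses Python's
built-in sort in place of Shell.Sort.Edge, and the name find_image_points
and the zero-based coordinates are assumptions.

```python
def find_image_points(grid, edge_points):
    """Return (interior_points, potential_hole_edge_points)."""
    edges = sorted(edge_points)  # tuples sort by row, then by column
    interior, hole_candidates = [], []
    for (row_a, col_a), (row_b, col_b) in zip(edges, edges[1:]):
        if row_a != row_b or col_b - col_a <= 1:
            continue  # different rows, or physically adjacent: nothing between
        last_image = (row_a, col_a)
        in_gap = False
        for col in range(col_a + 1, col_b):
            if grid[row_a][col] == "x":
                interior.append((row_a, col))
                last_image = (row_a, col)
                in_gap = False
            elif not in_gap:
                # First background pixel of a gap: remember the image
                # pixel just before it as a potential hole edge point.
                hole_candidates.append(last_image)
                in_gap = True
    return interior, hole_candidates


# Hypothetical 5x5 filled square with a one-pixel hole in the middle.
ring = [list("xxxxx"), list("xxxxx"), list("xx.xx"),
        list("xxxxx"), list("xxxxx")]
border = [(r, c) for r in range(5) for c in range(5)
          if r in (0, 4) or c in (0, 4)]
interior, hole_candidates = find_image_points(ring, border)
```

For a concave image such as the letter "U", the gap between the two arms
also produces candidates; those candidates are themselves edge points, which
is why a later step must decide which candidates bound actual holes.
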

The next function called is Calculate.Area (file
SET_UP_SYSTEM). It adds the number of image edge
points to the number of interior image points to obtain
the area of the image.

The next function called is Col.Sums. This function
converts the list of image columns from
Find.Image.Points into an ordered list similar to the
row list returned from Find.Image.Points.

Calculate.Center.Area is the next function called
(file SET_UP_SYSTEM). This function calculates the
center  of  area  for  the  image.  The  area  and  the  ordered 
row  and  column  lists  are  used  to  obtain  this  coordinate 
value. 
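
A minimal sketch of these two calculations follows. The thesis does not
spell out the formula for the center of area, so the sketch assumes the
usual centroid (the count-weighted mean of the row and column numbers); the
name area_and_center is hypothetical.

```python
from collections import Counter


def area_and_center(edge_points, interior_points):
    """Return (area, (center_row, center_col)) for one image."""
    points = list(edge_points) + list(interior_points)
    area = len(points)  # edge pixels plus interior pixels
    # Ordered lists of (row, count) and (column, count) pairs.
    row_counts = sorted(Counter(row for row, _ in points).items())
    col_counts = sorted(Counter(col for _, col in points).items())
    center_row = sum(row * count for row, count in row_counts) / area
    center_col = sum(col * count for col, count in col_counts) / area
    return area, (center_row, center_col)


# Hypothetical 3x3 square occupying rows 1-3 and columns 1-3.
square_edge = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)]
square_interior = [(2, 2)]
area, center = area_and_center(square_edge, square_interior)
```
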

Following that, the function Find.No.Of.Holes is
called  (file  CENTER).  This  function  is  called  only  if 
the  potential  hole  edge  list  returned  from 
Find.Image.Points  had  at  least  one  coordinate  point  in 
it.  First,  all  the  potential  hole  edge  points  which 
are  also  contained  in  the  list  of  image  edge  points  are 
removed.  Then  the  function  uses  the  remaining  points 
as  starting  points  in  an  edge  search  similar  to  the  one 
contained  in  Follow.Edge.  The  main  difference  is  that 
the minimum size of an edge that bounds the image is
desired, instead of the maximum, as was the case in
Follow.Edge. The search algorithms are similar, except
the  ordered  list  of  directions  to  be  searched  is 
shifted.  After  a  hole  has  been  successfully  traced, 
all  the  coordinate  points  contained  in  that  hole  trace 
are  removed  from  the  original  potential  hole  edge  list. 
Then  the  process  is  repeated  until  there  are  no  more 
coordinate  points  in  the  list  of  potential  hole  edge 
points.  The  function  returns  a  list  containing 
sublists,  each  with  a  hole  number  and  the  list  of  image 
coordinates  surrounding  that  hole.  The  only 
restriction  on  this  process  is  that  all  holes  beginning 
near  the  left  edge  of  an  image  must  be  separated  from 
an  edge  pixel  by  at  least  one  interior  image  pixel.  No 
hole  on  the  left  edge  of  an  image  can  have  an  edge 
pixel    bounding    it. 

The last function called in the low-level image
processing section is Replace.Image.Points (file
CENTER). This function replaces all the image pixels
in one image with a unique sequence number. This
allows the user to quickly spot the image on a printout
of the array, and lets the function Find.Image know
that that particular image has already been processed.

E.    INTERMEDIATE-LEVEL    IMAGE    PROCESSING

Intermediate-level  image  processing  contains  the 
algorithms  used  to  extract  feature  measurements  from 
the  image.  These  measurements  are  later  used  to 
classify  or  cluster  a  set  of  input  images.  In  general, 
most  feature  extraction  techniques  can  be  divided  into 
two  categories:  decision-theoretic  (statistical)  or 
syntactic  (linguistic).  The  syntactic  approach  will  be 
discussed  more  thoroughly  in  PART  II,  Section  F.  The 
following  paragraphs  will  discuss  the  invariances 
desired  and  examine  the  two  feature  extraction 
techniques  chosen  for   this   system. 

The  goal  of  statistical  pattern  recognition  is  to 
generate  a  finite  set  of  measurements  (feature  vector) 
which  characterize  the  image  being  analyzed.  These 
measurements  should  be  invariant  to  certain 
transformations. 

Features  are  geometrical  properties  of  an  image 
which,  when  combined,  lead  to  its  identification  [8]. 
Ideally,  these  features  should  exhibit  two  of  the  three 
isometries  (invariances)  of  the  mathematical  entity 
known as an affine geometry group [9]. The first
property of this group is translational (including
rotational) invariance. Two geometric objects
exhibiting  this  type  of  isometry  are  called  congruent. 
The  second  property  is  size  (scale)  invariance. 
Geometric  objects  with  this  isometry  retain  the 
concepts  of  angle  and  length  ratio,  but  lose  the 
concept  of  absolute  length.  The  last  invariance  of  the 
affine  geometry  group  is  one  that  is  not  desired  in 
this  pattern  recognition  system.  This  is  invariance  to 
shearing  transformations,  where  the  concept  of  angle  is 
lost.  In  an  affine  geometry  group,  all  parallelograms 
are  equivalent. 

The  two  feature  extraction  techniques  chosen  for 
this  pattern  recognition  system  exhibit  the  first  two 
invariances.  Although  the  invariances  are  not  fully 
obeyed,  the  techniques  are  relatively  insensitive  to 
translational, rotational and scale transformations.
The  output  of  each  feature  extraction  technique  is  an 
N-dimensional  feature  vector.  Each  feature  vector  can 
be  mapped  onto  a  single  location  in  an  N-dimensional 
feature  space.  The  size  of  each  of  these  dimensions  is 
determined  by  the  range  of  values  of  the  corresponding 
feature vector element. By judicious choice of the
feature extraction technique, points from "similar"
geometric  objects  will  be  in  close  proximity  to  each 
other  in  the  N-dimensional  feature  space.  Similarity 
in  this  pattern  recognition  system  refers  to  overall 
shape.  Correspondingly,  geometric  objects  which  are  not 
similar  should  be  widely  separated  from  each  other  in 
the  feature  space.  Many  different  measurements  are 
available  to  calculate  distances  between  points  in  the 
feature  space.  These  will  be  discussed  more  thoroughly 
in  PART  II,   Section  J. 
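
As a small illustration of proximity in feature space, the sketch below
assigns an unknown feature vector to the class of the closest library
vector. Euclidean distance is only one possible measure and is an assumption
here, as are the function names; the library values are invented for the
example and are not results from this system.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two points in feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_class(unknown, library):
    """Return the class name whose library vector is closest to `unknown`."""
    return min(library,
               key=lambda name: euclidean_distance(unknown, library[name]))


# Invented 3-dimensional feature vectors, for illustration only.
library = {
    "square": [0.10, -0.40, 6.0],
    "triangle": [0.55, -0.20, 3.5],
}
label = nearest_class([0.12, -0.35, 5.6], library)
```
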

In  general,  the  ability  to  classify  geometric 
objects  into  M  different  categories  (classes)  is  based 
on  a  division  of  the  N-dimensional  feature  space  into  M 
partitions.  The  ability  of  the  feature  extraction 
technique  to  cluster  similar  geometric  objects  and 
separate  dissimilar  ones  is  a  key  component  in  the 
classification  process.  The  size  of  these  separations 
provides  measurements  of  the  overall  quality  of  the 
feature  extraction  technique.  The  quality  of  the 
feature  extraction  technique  is  a  fundamental 
limitation  to  the  ability  of  the  classifier  to 
correctly  identify  geometric  objects.  Very  little 
general mathematical theory is available to help in
choosing feature extraction techniques. The two
techniques chosen for this pattern recognition system
are the autoregressive model and the Zernike moment
techniques.

An  autoregressive  model  is  a  parametric  equation 
that  expresses  each  sample  of  an  ordered  set  of  data 
samples  as  a  linear  combination  of  a  specified  number 
of  previous  samples  from  the  set  plus  an  error  term 
[10].  The  feature  vector  for  this  technique  is  obtained 
by  solving  a  system  of  linear  equations  containing  an 
ordered  sequence  of  terms  approximating  the  shape  of 
the  geometric  object.  These  terms  are  obtained  from 
prior  knowledge  of  the  edge  points  and  the  center  of 
area  for  the  geometric  object.  From  this  information, 
a  series  of  N  angularly  equi-spaced  rays  are  projected 
from  the  center  of  area  outwards  through  the  object 
boundary (Figure 4, where N = 8).

Figure 4.  Image with angularly equi-spaced rays.


The algorithm contained in function Auto.Reg.Edge
(file  REG)  checks  each  coordinate  point  in  the  list  of 
edge  points  to  find  which  ones  lie  on  one  of  these  rays 
or  are  closest  to  a  ray.  For  convex  objects,  there 
will  be  only  one  edge  point  that  is  closest  to  each 
ray  (Figure  4).  For  concave  objects,  there  may  be  more 
than  one  edge  point  per  ray  (each  ray  may  intersect  the 
boundary  of  the  image  more  than  once),  depending  on  how 
convoluted  the  object  is.  The  distance  from  the  center 
of  area  to  the  edge  point  closest  to  one  of  these  rays 
is  measured.  An  ordered  set  of  these  distances  is  used 
in  the  autoregressive  model  as  an  approximate  shape 
descriptor.  This  ordered  set  of  distances  can  be 
viewed  as  a  periodic  time  series.  The  specific 
equation for the model is

(1)   $R_t = A + \sum_{j=1}^{m} D_j R_{t-j} + B^{1/2} W_t$

where t = 1, 2, ..., M and M > N [11].

The definitions of the parameters in equation (1)
are:

$R_t$ = current ray intersection length

$R_{t-j}$ = ray intersection length detected j rays
before the current R term (collectively
called the lag terms)

D terms = unknown lag coefficients to be estimated
from the observed time series

m = model order

$B^{1/2}$ = unknown constant to be estimated

$B^{1/2} W_t$ = current error, residual noise

A = unknown constant to be estimated

$\{W_t\}$ = a random sequence of independent,
zero-mean samples with unit variance.

Since the variance of $W_t$ is 1, the $B^{1/2}$ factor scales
the unit-variance random variable $W_t$ into a random
variable of variance B.

The D terms are estimated from equation (1) by
fitting the model to the ordered sequence of ray
intersection lengths (R terms), using a least-squares
method to minimize the B term, which is a measure of
object shape noise [10]. The system of equations is

(2)   [system of equations not recoverable from the source text]

The parameter A is found after equation (2) is solved,
using the equation


(3)   $A = R' \left[ 1 - \sum_{j=1}^{m} D_j \right]$

where R' is the mean ray intersection length.
The B term can be estimated using

(4)   $B = \frac{1}{N} \sum_{t=1}^{N} \left[ R_t - A - \sum_{j=1}^{m} D_j R_{t-j} \right]^2$

where N is the number of ray intersections.

Since  A  is  proportional  to  the  mean  ray  intersection 
length,  it  is  a  measure  of  the  overall  size  of  the 
object. The larger an image is, the larger its value
for A will be. The B term is related to the boundary
noise of the object, since its value is calculated from
all the ray intersection lengths. The closer each ray
length is to its neighbor's length, the smaller B will
be (a circle will have a very small B). Therefore
$A/B^{1/2}$ can be used as a measure of the object's shape
signal-to-noise  ratio.  The  D  coefficients  simulate  the 
correlated  boundary  variations  and  can  be  used  as  shape 
descriptors. 

Model  parameters  will  remain  approximately  the  same 
for  scaled  or  rotated  images.  Rotation  will  only  shift 
the  ordered  sequence  of  ray  intersection  lengths  so 
that  it  starts  at  a  different  point.  Scale  changes 
will  affect  the  absolute  lengths  of  the  rays  but  not 
their  relative  lengths.  The  primary  differences  will 
be  due  to  the  digital  nature  of  the  images.  The  only 
two  parameters  which  are  sensitive  to  size,  A  and  B, 
are  combined  to  provide  a  measure  that  is  much  more 
scale invariant. The output feature vector for the
autoregressive technique is the set of D lag
coefficients and the $A/B^{1/2}$ term.
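
The estimation can be sketched in Python under stated assumptions. The
sequence of ray lengths is treated as periodic, so lags wrap around the end
of the list. The D terms come from the ordinary least-squares normal
equations on the mean-removed lengths, which stand in for the system of
equations (2) and are not claimed to be identical to it. A and B then follow
equations (3) and (4). The function names and the sample lengths are
invented for the example.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system (for example, a constant series)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_autoregressive(lengths, order):
    """Return (D coefficients, A, B) for the periodic series `lengths`."""
    n = len(lengths)
    mean = sum(lengths) / n
    x = [value - mean for value in lengths]

    def lag_sum(j, k):
        # Sum over t of x[t-j] * x[t-k], with circular (wrap-around) lags.
        return sum(x[(t - j) % n] * x[(t - k) % n] for t in range(n))

    matrix = [[lag_sum(j, k) for j in range(1, order + 1)]
              for k in range(1, order + 1)]
    rhs = [lag_sum(0, k) for k in range(1, order + 1)]
    d = solve_linear(matrix, rhs)
    a = mean * (1.0 - sum(d))                      # equation (3)
    b = sum((lengths[t] - a
             - sum(d[j - 1] * lengths[(t - j) % n]
                   for j in range(1, order + 1))) ** 2
            for t in range(n)) / n                 # equation (4)
    return d, a, b


def feature_vector(lengths, order):
    """D lag coefficients followed by the A / B^(1/2) term."""
    d, a, b = fit_autoregressive(lengths, order)
    return d + [a / math.sqrt(b)]


# Invented ray lengths for 16 rays, for illustration only.
sample = [10.0, 12.0, 9.0, 7.0, 8.0, 11.0, 13.0, 9.0,
          6.0, 8.0, 10.0, 12.0, 11.0, 7.0, 9.0, 8.0]
features = feature_vector(sample, 2)
```

With circular lags, starting the sequence at a different ray or multiplying
every length by a constant leaves the feature vector unchanged up to
rounding, which mirrors the rotation and scale behavior described above. A
perfectly constant sequence (an ideal circle) makes the system singular and
B zero, so a practical version would need to guard against that case.
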

The algorithm used to obtain the feature vector for
the autoregressive technique is contained in file REG.
The top level routine is named Top.Auto.Regressive.
The  user  is  restricted  to  one  of  three  choices  for  the 
two  key  parameters  in  the  model.  These  two  parameters 
are  the  number  of  rays  that  will  be  drawn  from  the 
center  of  area  and  the  number  of  lag  terms  that  will  be 
used  to  determine  the  overall  model  parameters.  The 
three  choices  are  1  lag  component  and  16  different 
rays,  or  2  lag  components  and  either  16  or  32  different 
rays. The restriction to 2 or fewer lag components was
made arbitrarily. The restriction on the number of
different rays allows their slope values to be
precomputed.

The function Top.Auto.Regressive first calls
Auto.Reg.Edge (file REG). Auto.Reg.Edge searches the
edge  list  for  ray  intersection  points  or  the  closest 
edge  points  to  a  ray.  First,  the  slope  values  for  the 
rays  are  read  in  from  a  global  variable.  These  slope 
values  have  already  been  calculated  and  are  for  rays  in 
the  first  quadrant,  from  slope  0  to  vertical.  Since 
the  number  of  rays  is  evenly  divisible  by  4,  only  one 
quadrant of values is required.

After these values are input, a loop is entered in
which  each  edge  point  is  checked  against  the  list  of 
ray  slopes.  Next,  the  numerator  and  denominator  values 
for  the  slope,  from  the  center  of  area  to  the  edge 
point  being  considered,  are  calculated.  The  primary 
purpose  of  this  function  is  to  determine  when  an  edge 
point  crosses  over  or  lands  on  one  of  the  rays.  The 
quadrant  location  and  the  location  of  the  two  rays 
whose  slopes  bound  the  slope  of  the  present  point  are 
found.  If  the  quadrant  is  different  or  the  bounding 
ray  slopes  have  changed  from  the  previous  point  to  the 
present  point,  a  ray  was  crossed  and  an  approximation 
to  a  ray  intersection  was  found.  An  intersection  is 
also  found  when  the  present  edge  point  lands  on  a  ray 
slope. 

After  an  intersection  has  been  found,  its  value  will 
be  placed  into  the  list  only  if  it  differs  from  the 
previous  intersection  placed  into  the  list.  This  is 
necessary  for  the  case  when  an  object  has  a  straight 
line  boundary.  In  such  cases,  certain  orientations  of 
the  object  may  result  in  several  adjacent  edge  points 
being  mapped  onto  one  ray  slope  intersection.  This 
causes the resulting series of intersection lengths to
be longer than the series for the same object with a
different orientation, biasing the model estimation.
If  a  new  intersection  has  been  crossed,  then  the  length 
from  the  center  of  area  to  the  edge  point  is 
calculated.  This  length  is  placed  into  a  list.  After 
all the edge points have been analyzed, Auto.Reg.Edge
returns the ordered list containing the intersection
lengths and the number of items in this list. A graph
of these values for a sample object (Figure 5) is
contained in Figure 6.
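
The edge search described above can be sketched in Python as follows. This is a minimal illustration, not the INTERLISP routine in file REG: the name ray_lengths is hypothetical, and angles computed with atan2 stand in for the precomputed first-quadrant slope table, which is an assumption made to keep the example short.

```python
import math

def ray_lengths(edge_points, center, num_rays=16):
    """Distances from center to the boundary at each ray crossing.

    edge_points: ordered, closed boundary as (row, col) tuples.
    center: (row, col) center of area; num_rays: evenly spaced rays.
    """
    cr, cc = center
    step = 2.0 * math.pi / num_rays

    def angle_of(point):
        return math.atan2(point[0] - cr, point[1] - cc) % (2.0 * math.pi)

    def sector_of(angle):
        return int(angle // step) % num_rays

    lengths = []
    last_ray = None  # ray of the most recent intersection kept
    prev_sector = sector_of(angle_of(edge_points[-1]))  # boundary is closed
    for point in edge_points:
        angle = angle_of(point)
        sector = sector_of(angle)
        nearest = round(angle / step)
        ray = None
        if math.isclose(angle / step, nearest, abs_tol=1e-9):
            ray = nearest % num_rays  # the point lands on a ray
        elif sector != prev_sector:
            # A ray was crossed between the previous point and this one.
            forward = (sector - prev_sector) % num_rays <= num_rays // 2
            ray = sector if forward else prev_sector
        # Keep the length only if it belongs to a new intersection.
        if ray is not None and ray != last_ray:
            lengths.append(math.hypot(point[0] - cr, point[1] - cc))
            last_ray = ray
        prev_sector = sector
    return lengths
```

Skipping an intersection that maps to the same ray as the previous one mirrors the duplicate check described in the text for straight-line boundaries.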

Top.Auto.Regressive then calls the function
Calculate.Sums. This function calculates the required
elements of the matrix in equation (2). These terms
are obtained by summing the appropriate combinations of
the intersection lengths returned from Auto.Reg.Edge.
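
The kind of sums that populate such a matrix, and the solution of the resulting system, can be illustrated with a least-squares fit of a circular autoregressive model. The sketch below assumes the usual form r[t] = alpha + sum_j theta[j] * r[t-j] + noise with variance beta; equation (2) is not reproduced in this section, so the correspondence of alpha, beta and theta to the A, B and D terms of the text is an assumption, and the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_circular_ar(r, lags):
    """Return (theta, alpha, beta) for a closed series of ray lengths.

    The series is treated as circular, so r[-1] precedes r[0].
    A constant series gives a singular system and is not handled here.
    """
    n = len(r)
    rows = [[1.0] + [r[(t - j) % n] for j in range(1, lags + 1)]
            for t in range(n)]
    size = lags + 1
    # Normal equations: sums of products of the lagged lengths.
    ata = [[sum(row[i] * row[k] for row in rows) for k in range(size)]
           for i in range(size)]
    atb = [sum(row[i] * r[t] for t, row in enumerate(rows))
           for i in range(size)]
    coef = solve_linear(ata, atb)
    alpha, theta = coef[0], coef[1:]
    resid = [r[t] - sum(c * v for c, v in zip(coef, rows[t]))
             for t in range(n)]
    beta = sum(e * e for e in resid) / n  # residual variance
    return theta, alpha, beta
```

Under these assumptions the feature vector would be the theta values followed by alpha divided by the square root of beta, which is defined only when the residual variance is nonzero.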

The last function called by Top.Auto.Regressive is
Solve.Auto.Matrix. This function places the terms
calculated in Calculate.Sums in the correct order in
the matrix and sets up the solution array. The system
of equations is then solved and the A and B parameters
are calculated from the D terms. Top.Auto.Regressive
returns the feature vector containing the m D
parameters and the $A/B^{1/2}$ term.

Figure  5  represents  an  array  containing  a 
complex  image.  This  image  is  analyzed  using  the 
autoregressive  feature  extraction  technique.  The  list 
of  ray  intersection  lengths  obtained  from  the 
autoregressive  algorithm  is  then  plotted  to  show  what 
the  function  accomplishes.  The  plot  is  on  the 
following page. Figure 5 shows only a portion of the
32x32 data array and is numbered for convenience. The
area  of  this  image  is  68  and  the  center  of  area  is 
(9.84  14.54),  with  the  row  number  first,  then  column 
number.  The  approximate  location  of  the  center  is 
marked  with  a  "c".  Image  pixels  are  marked  with  an 
"x".  The  ray  search  begins  at  edge  coordinate  (5  16), 
which  was  the  last  edge  pixel  found  in  the  edge  search, 
and proceeds counter-clockwise.

Figure 5. Sample image.

Figure 6. Ordered List of Ray Lengths

The second feature extraction technique used by this
pattern recognition system is based on Zernike moment
invariants  [12].  The  first  published  paper  on  the  use 
of  image  moment  invariants  for  pattern  recognition 
appeared  in  1962  [13].  From  the  theory  of  statistical 
moments,  a  uniqueness  theorem  states  that  if  f(x,y)  is 
piecewise  continuous  and  has  nonzero  values  only  in  a 
finite  part  of  the  x-y  plane,  then  moments  of  all 
orders  exist,  and  the  moment  sequence  (Mpq)  is  uniquely 
determined  by  f(x,y)  and,  conversely  (Mpq)  uniquely 
determines  f(x,y)  [14].  The  problem  is  how  to  avoid 
having  to  calculate  high  orders  of  moments,  yet  still 
retain  enough  information  to  correctly  classify  an 
image. 

In  the  analog  world,  given  a  continuous  function  of 
two  variables  x  and  y,  any  moment  of  order  (p+q)  can  be 
described  by  the  following  equation 

(5)  $M_{pq} = \iint_R x^p y^q f(x,y)\,dx\,dy$

where R is the region within which the object lies.
The digital approximation to (5) is

(6)  $M_{pq} = \sum_{(x,y) \in R} x^p y^q f(x,y)$

If the images are restricted to binary ones, as in this
system,  f(x,y)  is  restricted  to  1  or  0.  In  this  case, 
equation  (6)  can  be  simplified  using  a  discrete  version 
of  Green's  Theorem  [15].  This  theorem  relates  a  double 
integral  (double  summation)  over  a  bound  area  to  a  line 
integral  (single  summation)  over  the  boundary  of  the 
object.  The  application  of  this  theorem  to  the  moment 
equation yields the following:

(7)  $M_{pq} = \frac{1}{p+1} \sum_{i=1}^{E} x_i^{p+1} y_i^{q}\,\delta_i$

where $\delta_i = \mathrm{sign}(y_i - y_{(i-1) \bmod E})$,
E is the number of edge points in the boundary,
and sign is the sign operation, returning either
+1 or -1 [16].

This  approximation  yields  a  significant  savings  in  the 
total  number  of  mathematical  operations  required  to 
obtain  the  moments.  The  long  version  for  obtaining  the 
moments,    equation    (5),    is  still    available  to   the   user. 

The  next  operation  on  the  moments  is  to  centralize 
them.  This  operation  makes  them  invariant  to 
translation. The discrete equation for centralizing
binary moments is

(8)  $\mu_{pq} = \sum_{(x,y) \in R} (x - x')^p (y - y')^q$

where x' and y' are the mean x and y values for the
image.

To  obtain  scale  invariance,  the  centralized  moments 
are  normalized.  The  general  equation  for  normalizing 
the  centralized  moments  is 

(9)  $\phi_{pq} = \mu_{pq} / \mu_{00}^{a}$

where a = 1 + [(p+q)/2] for p+q > 1.
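
Equations (6), (8) and (9) can be checked on a small binary image with the following Python sketch. The function names are hypothetical, and representing the image as a list of (x, y) pixel coordinates is an assumption made for brevity.

```python
def raw_moment(pixels, p, q):
    """Equation (6) for a binary image given as (x, y) pixel coordinates."""
    return sum((x ** p) * (y ** q) for x, y in pixels)

def central_moment(pixels, p, q):
    """Equation (8): moments taken about the mean x and y values."""
    m00 = raw_moment(pixels, 0, 0)
    x_mean = raw_moment(pixels, 1, 0) / m00
    y_mean = raw_moment(pixels, 0, 1) / m00
    return sum(((x - x_mean) ** p) * ((y - y_mean) ** q) for x, y in pixels)

def normalized_moment(pixels, p, q):
    """Equation (9): central moment divided by mu_00 ** (1 + (p + q) / 2)."""
    a = 1 + (p + q) / 2
    return central_moment(pixels, p, q) / central_moment(pixels, 0, 0) ** a
```

Translating every pixel by the same offset leaves central_moment unchanged, which is the translation invariance described above. On a digital grid, scale invariance of the normalized moments holds only approximately.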

Rotational invariance is difficult to achieve. Hu
[13] derived rotational invariance by combining
specific normalized, centralized moments. Another way
to do this is to use the circular polynomials of
Zernike. Zernike developed these polynomials in 1934
while studying light diffraction from small aberrations
[17]. They are orthogonal over the interior of the unit
circle and, unlike most other polynomials with this
orthogonality, also have basic invariance properties,
including rotational invariance. The polynomials have
the following form, in two real variables,

(10)  $V_{nl}(x,y) = R_{nl}(\rho)\exp(il\theta)$

where $\rho$ and $\theta$ are the polar coordinates of the
point (x,y).

These  functions  are  complete  over  the  unit  circle  and 
satisfy  the  following  equation 

(11)  $\iint dx\,dy\,[V_{nl}(x,y)]^{*}\,V_{mk}(x,y) = \frac{\pi}{n+1}\,\delta_{mn}\delta_{kl}$

where the integration is over the unit circle
$x^2 + y^2 \le 1$.

The last two terms in the numerator of the right
half of the equation are Kronecker deltas.
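
For reference, the radial part of these polynomials can be evaluated from the standard closed form for the Zernike radial polynomial, which the text does not reproduce. The Python sketch below uses hypothetical function names and is not taken from file MOMENT. It takes the real part of equation (10) as the radial polynomial multiplied by the cosine of l times theta.

```python
from math import cos, factorial

def radial_poly(n, l, rho):
    """Zernike radial polynomial R_nl(rho); requires |l| <= n, n - |l| even."""
    l = abs(l)
    if l > n or (n - l) % 2:
        raise ValueError("need |l| <= n and n - |l| even")
    total = 0.0
    for s in range((n - l) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + l) // 2 - s)
               * factorial((n - l) // 2 - s))
        total += num / den * rho ** (n - 2 * s)
    return total

def zernike_real(n, l, rho, theta):
    """Real part of V_nl in equation (10): R_nl(rho) * cos(l * theta)."""
    return radial_poly(n, l, rho) * cos(l * theta)
```

For example, radial_poly(2, 0, rho) reduces to 2 * rho**2 - 1, and every admissible radial polynomial equals 1 at rho = 1.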

Because of the nature of geometric objects, only the
real portions of equation (10) are considered. By
expansion, an equation is derived that can be used to
calculate the Zernike moments from the appropriate
combinations of normalized and centralized moments. A
complete description of the moments is contained in
the comments of function Get.Zernicke.Moments in file
MOMENT. A thorough derivation is contained in reference
[17]. A maximum of 11 different Zernike moments (up
through fourth order) was used in this pattern
recognition system.

The algorithms used to calculate the Zernike
moments are contained in the file MOMENT. The first
function called is Calculate.Zernicke.Moments. This
function calls Calculate.Moments, which calculates the
moments up through the fourth order from the list of
edge points. Then, Central.Moments is called to
normalize and centralize the moments. Finally,
Get.Zernicke.Moments is called to calculate the
Zernike moments from the normalized and centralized
ones. If the user desires to calculate the moments
using both edge and interior points, then a different
top function is called, Calculate.Zernicke.Long. This
function calls a slightly different function,
Calculate.Real.Moments, to obtain the moments from all
the image points. After that, the other two functions
described above are called. The output from either of
these two routines is a 2, 6 or 11 element feature
vector containing the Zernike moments.

F.    HIGH-LEVEL    IMAGE    PROCESSING 

For  some  applications,  the  feature  vector  obtained 
from  intermediate  statistical  processing  is  inadequate 
for thoroughly analyzing the image or scene. In
3-dimensional image analysis, high-level processing is
especially important. What is most often required in
such an analysis is a hierarchical description of the
scene. Structural understanding of a scene is not
something  most  intermediate-level  processing  can 
accomplish.  Syntactic  (linguistic)  analysis  is  much 
better  suited  to  accomplishing  the  goals  of  high-level 
processing. 

At  its  most  basic  level,  a  syntactic  language 
consists  of  a  set  of  symbols  (alphabet)  and  a 
corresponding  set  of  statements  (production  rules)  that 
detail  how  the  symbols  can  be  manipulated  in  order  to 
create  higher  order  structures  [18].  The  symbols  can 
be  letters  or  simple  geometric  shapes.  If  one  uses 
simple  geometric  shapes,  the  production  rules  can 
describe  how  the  shapes  will  be  put  together  to  create 
more  complicated  geometric  objects.  By  looking  at  the 
ordered  sequence  of  symbols  that  was  strung  together 
(hierarchy),  one  has  the  basis  for  a  pattern 
recognition  system.  After  a  new  image  is  input,  the 
algorithm  (parser)  tries  to  construct  the  same  image 
from  basic  components  (alphabet)  using  the  production 
rules.  Very  complicated  scenes  can  be  analyzed  using 
this  technique.  For  simple  geometric  objects  most 
intermediate-level  statistical  processes  are  faster  and 
more efficient.

G. CLASSIFIER

The  classifier  is  the  last  component  in  the  computer 
vision  process.  It  was  introduced  in  PART  II,  Section 
E.  Classification  includes  the  decision-making 
algorithm  that  decides  which  image  class  the  unknown 
object  belongs  to.  The  central  assumption  underlying 
the  classification  process  is  that  a  set  of  N- 
dimensional  feature  vectors  from  one  type  of  geometric 
object  (one  image  class)  will  be  points  in  close 
proximity  to  each  other  in  the  N-dimensional  feature 
space,  based  on  some  type  of  distance  measure. 
Correspondingly,  feature  vectors  from  different  image 
classes  should  be  points  widely  separated  in  the 
feature  space.  A  slight  geometric  distortion  in  an 
object  from  one  image  class  should  result  in  a  small 
physical  displacement  of  that  image  point  relative  to 
other   points   in  the   same    class    [19]. 

The  N-dimensional  feature  space  is  partitioned  into 
M  sections  for  a  classifier  with  M  different  image 
classes.  Ideally,  there  should  not  be  any  overlap  of 
partitions.  Overlapping  regions  require  more 
sophisticated  classification  techniques  in  order  to 
separate the image classes.

The number of classes in a classification process is
determined  only  by  the  particular  type  of  problem  being 
solved.  Large  numbers  of  classes  should  be  avoided 
because  each  must  be  checked  when  the  classifier  is 
being  run.  This  pattern  recognition  system  allows  the 
user  to  dynamically  adapt  the  classifier  by  adding  new 
classes  or  adding  new  images  to  previously  defined 
classes. 

Once  the  feature  space  has  been  partitioned,  an 
appropriate  distance  measuring  algorithm  must  be 
chosen.  For  a  system  with  small  feature  element 
variability  within  each  class  and  wide  separation 
between  different  image  classes,  a  simple  linear 
Euclidean  distance  measure  can  be  used.  This  system 
contains two metrics. One is the Minkowski metric of
order s, where the absolute differences between the
feature elements in the two feature vectors are raised
to the s power and summed. The final result is raised to
the 1/s power (s in this system is restricted to 1, 2 or
3). The other metric is the Chebychev, where the
distance between two feature vectors is chosen to be
the largest of the absolute differences between the
individual elements in the feature vectors.

Next, the decision-making portion of the classifier
examines  the  distances  between  the  unknown  image  and 
each  of  the  image  classes  and  selects  class  membership 
based  on  the  smallest  distance.  These  distance  metrics 
must  be  used  cautiously  because  it  is  not  always 
accurate  to  say  that  the  farther  away  two  points  are  in 
the  feature  space,  the  more  dissimilar  their 
corresponding   geometric   shapes   are    [20]. 
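
The two metrics and the smallest-distance decision rule can be sketched in Python as follows. The function names, and the choice of a dictionary holding one representative feature vector per class, are assumptions made for illustration and do not reflect the layout used in file CLASSIFY.

```python
def minkowski(a, b, s):
    """Minkowski metric of order s (s = 2 gives the Euclidean distance)."""
    return sum(abs(x - y) ** s for x, y in zip(a, b)) ** (1.0 / s)

def chebyshev(a, b):
    """Largest absolute difference between corresponding elements."""
    return max(abs(x - y) for x, y in zip(a, b))

def nearest_class(unknown, library, metric):
    """Return the closest class name and all distances.

    library maps a class name to one representative feature vector.
    """
    distances = {name: metric(unknown, vec) for name, vec in library.items()}
    return min(distances, key=distances.get), distances
```

A call such as nearest_class(v, library, lambda a, b: minkowski(a, b, 2)) selects the class with the smallest Euclidean distance to v.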

The function that contains the above metrics is
Distance.Classifier in the file CLASSIFY. In this
function, the distance (user-selected metric) between
the unknown feature vector and one feature vector from
each image class is calculated. An array containing
these distances is returned for later use by the
decision-making function Classify.Decision (file
CLASSIFY). Another set of measurements is also
calculated in Distance.Classifier. This set contains
variance measurements for the distances between feature
vectors. The sum of the variances for the differences
used to calculate the final Euclidean distance is
placed into an array, with one variance per image
class. The user can use this measure to decide between
two distances that are very close together.

Classify.Decision is called after
Distance.Classifier is completed. This function
searches  the  distance  array  for  the  two  smallest 
distances  and  returns  the  corresponding  array  sequence 
numbers.  The  ratio  of  the  two  closest  distances  is 
also  returned,  giving  the  user  an  idea  of  the 
separation  of  the  two  closest  image  classes.  The  ratio 
of  the  same  two  array  elements'  variances  is  also 
returned. 
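
A minimal Python sketch of this decision step is given below. The name classify_decision is hypothetical, and the direction of each ratio (smallest value over second smallest) is an assumption, since the text does not state which value is the numerator.

```python
def classify_decision(distances, variances):
    """Indices of the two smallest distances plus two ratios.

    distances and variances hold one entry per image class.
    """
    if len(distances) < 2:
        raise ValueError("at least two image classes are required")
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    first, second = order[0], order[1]

    def ratio(a, b):
        # The ratio is undefined when the divisor is zero.
        return a / b if b != 0 else None

    return (first, second,
            ratio(distances[first], distances[second]),
            ratio(variances[first], variances[second]))
```

A distance ratio near 1 indicates that the two closest classes are nearly equidistant from the unknown image, which is when the variance ratio becomes useful as a tie-breaker.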

A  more  sophisticated  classifier  is  available  to  the 
user  in  this  pattern  recognition  system.  This 
technique  transforms  the  N-dimensional  feature  space 
into  a  new  N-dimensional  feature  space  where  the 
within-class  variability  has  been  reduced.  The 
between-class  distances  are  not  greatly  affected.  A 
simple  transformation  can  be  obtained  by  weighting  the 
individual  elements  in  a  feature  vector  differently. 
This  is  conceptually  appealing,  since  all  the  elements 
in  a  feature  vector  rarely  contribute  equally  to  the 
overall   image  classification. 

To  obtain  smaller  within-class  distances  for  those 
image classes with more than one feature vector, the
statistical variance of each element in the feature
vector is calculated. Feature elements are then
weighted in inverse proportion to their variances.
Therefore, a feature element from one image class that
does not vary much will be weighted more heavily than an
element whose value is not as constant. Ideally, if an
element  remains  constant,  its  weight  should  be  infinite 
and  all  the  rest  of  the  weights  should  be  zero.  To 
test  for  class  membership,  only  the  one  constant 
element  needs  to  be  checked.  The  problem  with  this 
approach  is  that  the  resulting  classifier  is  often  less 
efficient than the simple Distance.Classifier discussed
previously,  because  only  one  element  is  used  to 
determine  class  membership.  To  remedy  this  situation, 
any  element  with  zero  variance  is  arbitrarily 
reassigned  a  variance  20  times  smaller  than  the  next 
smallest  variance.  This  allows  all  elements  in  the 
feature  vector  to  be  used  when  calculating  distances. 
For  a  non-adaptive  classifier  of  this  type,  the 
individual  element  weights  for  each  class  feature 
vector  need  to  be  calculated  only  once. 

Mathematically, this technique yields feature
element weights that represent the solution to the
minimization problem of the within-class distances
[21]. A further constraint on the weights is added to
disallow the transformation that merely shrinks the
entire N-dimensional feature space. This shrinkage
satisfies the overall criterion of decreasing the
within-class distances, but nothing has been gained.
The equation to be minimized is

(12)  $D^2 = \sum_{n=1}^{N} w_{nn}^2 (A_n - B_n)^2$

where D is the distance, N is the number of elements
in the feature vector, and A and B represent two
feature vectors in the class.

This distance is to be minimized, not only for A and B,
but for all images in the class. The added constraint
is

(13)  $\prod_{n=1}^{N} w_{nn} = 1$

A complete derivation of this problem is outlined by
Sebestyen [21]. The distance in the transformed
N-dimensional feature space between point p in the
unknown image and one class of images F is

(14)  $D = \left(\prod_{p=1}^{N} \sigma_p\right)^{2/N} \left[\sum_{n=1}^{N} \left(\frac{p_n - \bar{f}_n}{\sigma_n}\right)^2 + N\right]$

where $\sigma$ is the variance of an element, N is the
number of elements in the feature vector, p is the
unknown feature vector and $\bar{f}$ is the sample mean
feature vector.
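
The weighting scheme and the distance of equation (14) can be sketched in Python as follows. The function names are hypothetical. The sketch assumes population variances (division by the number of vectors) and takes each sigma as the square root of the element variance, following the usual form of this derivation; the text itself calls sigma the variance, so that reading is an assumption.

```python
import math

def class_statistics(vectors):
    """Per-element means and variances for one image class.

    A zero variance is replaced by 1/20 of the smallest nonzero
    variance, as described in the text.
    """
    n = len(vectors)
    size = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(size)]
    variances = [sum((v[i] - means[i]) ** 2 for v in vectors) / n
                 for i in range(size)]
    nonzero = [v for v in variances if v > 0.0]
    if not nonzero:
        raise ValueError("every feature element is constant in this class")
    floor = min(nonzero) / 20.0
    return means, [v if v > 0.0 else floor for v in variances]

def weighted_distance(unknown, means, variances):
    """Equation (14), with sigma taken as sqrt(variance) (an assumption)."""
    n = len(means)
    sigmas = [math.sqrt(v) for v in variances]
    scale = math.prod(sigmas) ** (2.0 / n)
    total = sum(((p - f) / s) ** 2
                for p, f, s in zip(unknown, means, sigmas))
    return scale * (total + n)
```

Because the statistics depend only on the library, class_statistics needs to be evaluated once per class for a non-adaptive classifier.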

The algorithm used to calculate these weights and
distances is contained in function Weighted.Classifier
in file CLASSIFY. Weighted.Classifier calls
Weight.Library to calculate the variances of the
feature elements for each class. Weight.Library returns
a 1-dimensional array containing the means and
variances for the feature elements in one image class.
Weighted.Classifier calls Weight.Library once per image
class and places all the returned arrays into a
2-dimensional array. At the end of the function, the
weights for each feature element in each class are
calculated from the appropriate means and variances.
The distance from the unknown feature vector to each
image class is calculated (14) and returned. After the
distances are calculated in Weighted.Classifier,
Classify.Decision is called to decide on class
membership. This is the same routine that was used
after Distance.Classifier was called. All the same
quantities are calculated in Distance.Classifier as
before except for the ratio of the distance variances.

The  library  of  images  used  to  partition  the  N- 
dimensional  feature  space  is  an  important  aspect  of  the 
classifier.  Several  functions  in  this  pattern 
recognition  system  deal  specifically  with  the  image 
library. Both Distance.Classifier and
Weighted.Classifier require that the library of images
already  be  formatted.  This  means  that  the  feature 
vectors  for  the  library  images  must  already  have  been 
calculated  and  placed  into  lists  according  to  their 
class membership. Formatted library files can be
loaded from and saved to storage using the two functions
Create.Library.From.File and Store.Library.To.File.
These are contained in the file SET_UP_SYSTEM. If no
library files are available, or the user wants to start
a new one, an adaptive classifier can be chosen. Using
the function Add.To.Library (file SET_UP_SYSTEM), the
user is prompted for the name of the image class. The
function then places the new feature vector and new
name into the formatted library. The same process is
used to add an additional feature vector to an already
existing image class. The user has a lot of
flexibility  in  designing  the  desired  type  of  library 
and  classifier. 

The  last  function  available  to  the  user  in  the 
classifier  is  one  that  calculates  four  measures  of 
classifier  quality.  All  four  measures  are  based  on 
differences  between  within-class  distances  and  between- 
class  distances  [22].  This  function  is  called 
Quality.Of.Classifier (file CLASSIFY). Each of these
measures  requires  the  calculation  of  two  matrices:  the 
within-class  scatter  matrix  and  the  between-class 
scatter  matrix.  Both  are  square  matrices  with 
dimensions  equal  to  the  number  of  elements  in  the 
feature  vector.  The  between-class  matrix  can  be 
represented  as 


(15)  $B = \sum_{i=1}^{c} P_i (m_i - m)(m_i - m)^T$

where c is the number of different image classes,
$P_i$ is the probability of occurrence of that
particular image class (all probabilities in this
system are assumed to be equal), $m_i$ is a vector
containing the mean feature elements for the ith
image class, and m is a vector containing the
mixture sample mean (calculated from all the
feature vectors from all the image classes).

The  within-class  matrix   can   be   represented  as 

(16)  $W = \sum_{i=1}^{c} \frac{P_i}{n_i} \sum_{k=1}^{n_i} (e_{ik} - m_i)(e_{ik} - m_i)^T$

where c, $P_i$, and $m_i$ are the same as in equation
(15), $n_i$ is the number of feature vectors in the
ith image class and $e_{ik}$ is the kth feature vector
in the ith image class.

The  first,  and  simplest,  quality  measure  (using  the 
B  and  W  matrices  described  above)  is 

(17)  $Q_1 = \mathrm{TRACE}(B + W)$

where  TRACE   is   the  matrix  operation  of   that  name. 

This  is  the  only  measure  that  can  be  calculated  for 
image  classes  containing  only  one  feature  vector  (W 
equals  the  zero  matrix).      The   second  measure   is 

(18)  $Q_2 = \mathrm{TRACE}(B) / \mathrm{TRACE}(W)$

This measure is a more realistic measure of classifier
quality than the previous one because it is the ratio
of the two matrices' traces instead of the sum. It
still does not account for the effect on the distance
between classes caused by the correlation of feature
vector elements. To avoid this problem, W can be
altered by a suitable transformation U, so that the
average covariance matrix of the transformed vector is
the identity matrix (I) [23]:

(19)  $U^T W U = I$, therefore $U = W^{-1/2}$

The   third  measure  of   classifier  quality    is  then 

(20)  $Q_3 = \mathrm{TRACE}(W^{-1}B)$

The last, and most comprehensive measure of classifier
quality is

(21)  $Q_4 = \mathrm{DETERMINANT}(B + W) / \mathrm{DETERMINANT}(W)$

Q4  has  the  advantage  over  Q3  in  that  it  is  less  likely 
to  select  a  set  of  features  that  allows  good  separation 
between  only  a  few  of  the  total  number  of  classes,  at 
the  expense  of  all  other  class  separations  [24].  (All 
the  matrix  operations  required  to  solve  these  equations 
are  contained  in  the  file  MATRIX.)  As  the  quality  of 
the  classifier  increases,  so  should  the  absolute  value 
of all four of these measures. In an adaptive system,
the user can use these measures to see how much the
classifier is improving.
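
The two scatter matrices and the two trace-based measures can be sketched in plain Python as follows. The function names are hypothetical and equal class probabilities are assumed, as stated for this system. The measures of equations (20) and (21) additionally need a matrix inverse and a determinant, which are omitted here for brevity.

```python
def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def add_outer(acc, d, scale):
    """Add scale * d d^T to the square matrix acc in place."""
    for i, a in enumerate(d):
        for j, b in enumerate(d):
            acc[i][j] += scale * a * b

def scatter_matrices(classes):
    """Return (B, W) for a list of classes, each a list of feature vectors."""
    size = len(classes[0][0])
    prior = 1.0 / len(classes)  # equal class probabilities
    mixture = mean_vector([v for cls in classes for v in cls])
    b = [[0.0] * size for _ in range(size)]
    w = [[0.0] * size for _ in range(size)]
    for cls in classes:
        m_i = mean_vector(cls)
        add_outer(b, [x - y for x, y in zip(m_i, mixture)], prior)
        for e in cls:
            add_outer(w, [x - y for x, y in zip(e, m_i)], prior / len(cls))
    return b, w

def trace(mat):
    return sum(mat[i][i] for i in range(len(mat)))

def quality_q1(b, w):
    """Equation (17): TRACE(B + W)."""
    return trace(b) + trace(w)

def quality_q2(b, w):
    """Equation (18): undefined when W is the zero matrix."""
    return trace(b) / trace(w)
```

When every class holds a single feature vector, W is the zero matrix and only quality_q1 can be evaluated, in agreement with the remark following equation (17).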

H.    CLUSTER   ANALYSIS 

Cluster  analysis  is  actually  a  form  of 
classification  that  can  be  used  when  no  library  images 
are  available  or  desired.  Clustering  is  a  technique 
used  to  group  a  set  of  feature  vectors  by  a  distance 
metric,  and  is  a  well-known  analytical  tool  in 
statistics.  Clustering  can  be  used  to  test  hypotheses 
concerning  similarities  and  differences  between  groups 
of   images;    it   is  a  tool   for   investigation. 

The  type  of  clustering  available  in  this  pattern 
recognition  system  is  hierarchical  clustering.  The 
algorithm  is  contained  in  the  function  Clustering 
(file  CLASSIFY).  Before  this  function  is  called,  the 
user  must  choose  one  of  the  two  available  feature 
extraction techniques: autoregressive or Zernike
moments. Then the user must choose the distance metric
to be used in function Distance.Classifier (file
CLASSIFY). The available ones are the Minkowski
metrics of order 1, 2 or 3, or the Chebychev metric.

The main loop in Clustering continues until all the
input  feature  vectors  have  been  combined  into  one 
cluster.  In  each  loop  iteration,  the  distance  between 
each  feature  vector  and  all  other  feature  vectors  is 
measured,  and  this  value  is  stored  in  a  2-dimensional 
array.  After  all  these  distances  have  been  calculated, 
the  smallest  one  is  located.  The  two  feature  vectors 
that  correspond  to  this  distance  are  then  averaged  and 
this  new  averaged  vector  replaces  the  first  of  the  two 
vectors.  The  other  feature  vector  is  eliminated  from 
the  list.  Each  iteration  of  the  outer  loop  reduces  the 
number  of  feature  vectors  by  one.  Averaging  is  one  of 
the  simplest  methods  of  combining  the  two  feature 
vectors.  Other  more  sophisticated  techniques  are 
possible    [25]. 
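
The averaging scheme described above can be sketched in Python as follows. The function names are hypothetical, and the sketch recomputes the pairwise distances on each pass rather than storing them in a 2-dimensional array, a simplification made for brevity.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hierarchical_clustering(vectors, metric=euclidean):
    """Merge the closest pair until one cluster remains.

    Returns (first, second, distance) triples in reverse order of
    merging, where first and second are sequence numbers.
    """
    items = {i: list(v) for i, v in enumerate(vectors)}
    merges = []
    while len(items) > 1:
        keys = sorted(items)
        best = None
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                d = metric(items[a], items[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # The averaged vector replaces the first; the second is removed.
        items[a] = [(x + y) / 2.0 for x, y in zip(items[a], items[b])]
        del items[b]
        merges.append((a, b, d))
    merges.reverse()
    return merges
```

Each pass removes one feature vector, so K input vectors always produce K - 1 merge records.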

The  function  Clustering  returns  an  ordered  list  of 
sublists,  each  containing  the  pair  of  sequence  numbers 
for  the  feature  vectors  that  were  clustered  and  the 
distance between them. The list is in reverse order,
with the last pair of feature vectors clustered listed
first. From this information, a dendrogram can be
plotted. A dendrogram looks like a binary tree with each
node having exactly two offspring. The dendrogram's X
axis contains the sequence number (name) of the feature
vectors and the Y axis is distance. Examples are
contained in PART IV.

III. THE PROGRAMMING ENVIRONMENT

A.  BACKGROUND 

In any large software system, serious consideration
must be given to the methods for program development
and the programming style to be used. Ideally, this
should be done prior to writing any software. Key
concerns are the computer hardware and the computer
language to be used.

The use of the Electrical and Computer Engineering
Department's VAX 11/750 was a practical decision,
given its availability. The choice of a superset of
LISP,  called  INTERLISP,  as  the  programming  language  was 
made  primarily  to  allow  for  the  development  of  a  higher 
level  software  shell,  using  various  concepts  from  the 
field  of  artificial  intelligence  (AI).  Its 
availability  on  the  VAX  was  also  a   key  factor. 

B.  LISP 

LISP (List Processing) was developed in the 1950's
by John McCarthy at M.I.T. It was based largely on
work done by Alonzo Church (on lambda calculus) that
dealt with symbolic, as opposed to numerical,
calculations [26].

LISP  has  a  very  simple  basic  structure  in  which  all 
data is in the form of symbolic expressions, or
s-expressions. S-expressions consist of two types of
objects: (1) atoms, which are symbols, numbers or
strings and (2) lists, which can contain other lists
and/or atoms nested to any finite depth. The great
nemesis of the LISP programmer, the parenthesis, is used
to mark the beginning and end of each s-expression.
There are no restrictions placed on these
s-expressions; therefore
it is possible to accomplish almost anything in the
language,  including  self-modifying  code  and  functions 
that  can  create  and  execute  other  functions.  It  is 
easy  to  create  functions  that  take  other  functions  as 
parameters.  One  important  advantage  of  LISP  is  that 
the  language  makes  no  distinction  between  functions 
(executable  code)  and  data.  This  characteristic  gives 
LISP  much  of  its  power  and  richness.  Its  primary 
appeal,  in  constructing  a  pattern  recognition  system, 
is  the  ability  to  write  higher-level  structures 
(shells)  that  provide  sophisticated  controls  to  the 
software developer. An object-oriented programming
structure was built and will be described in more
detail in PART III, Section G.

C. INTERLISP

INTERLISP  (INTERactive  LISP)  is  a  superset  of  LISP 
developed  at  XEROX  Corporation  in  the  1970's  [27].  The 
version  that  runs  on  the  VAX  was  written  at  the 
University  of  Southern  California.  INTERLISP  has  a 
very  powerful  and  rich  development  environment.  It 
contains  the  following   important  aspects: 

1.  a  sophisticated  structure  editor,  which  helps  keep 
track   of  nesting  and   parentheses, 

2.  an  interactive  debugger,  which  allows  the  developer 
to  view,    or   change,    stack   contents, 

3.  a  print  facility,  which  outputs  structured  code, 
clearly  showing  the  level    of   code  nestings, 

4.  a  program  analysis  package,  which  can  be  used  to 
obtain  a  picture  of  the  overall  flow  of  control  within 
a    program, 

5.  forgiveness  and  automatic  correction  for  some  types 
of  errors    (especially  spelling  errors), 

6.  a  mechanism  for  undoing  the  results  of  previous 
operations,  which  allows  the  developer  to  keep  several 
different  versions  of  the  same  function  running 
simultaneously [28].

Unlike many other interactive languages, INTERLISP
can also be compiled. Compiling increases the overall
execution speed approximately 35-fold. Another
important  aspect  of  the  language  is  that  variable 
bindings  are  deferred  until  execution  time.  This  can 
save  the  developer  time  when  prototyping  new  software. 
None  of  these  benefits  come  without  some  cost.  One 
disadvantage  of  INTERLISP  is  its  complexity.  It  is  a 
difficult  language  to  learn  to  use  well.  Another 
disadvantage  is  the  large  amount  of  physical  memory 
required  and  the  corresponding  reduction  in  execution 
speed. Along with the INTERLISP Reference Manual
[IRM], another excellent source of information about
the language is INTERLISP: The Language and Its Usage
[29].

D.     PROGRAMMING    STYLE 

With  a  language  as  rich,  powerful  and  unstructured 
as  INTERLISP,  certain  guidelines  must  be  established  so 
that  the  finished  software  product  is  readable, 
understandable  and  maintainable.  The  following 
guidelines  were  used  in  the  development  of  this  pattern 
recognition system.

1. The use of recursion is limited, since it is often
hard  to  follow  what  is  being  accomplished  in  such  a 
procedure. 

2.  The  number  of  global  variables  is  minimized.  Most 
of  the  variables  used  in  this  system  are  declared  in  a 
PROG  statement  which  localizes  them  to  the  defining 
function. 

3.  A  RETURN  statement  for  the  variable(s)  to  be 
returned  from  the  function  call  is  used.  This  is 
necessary  when  PROG  is  used,  and  allows  the  developer 
to  clearly  see  the  return  values   from  the  function. 

4.  CLISP  (Conversational  LISP)  statements  are  used 
wherever  possible  for  conditional  statements  and 
looping  blocks.  CLISP  statements  include  IF-THEN-ELSE, 
DO-WHILE  and  DO-UNTIL.  These  produce  a  more  readable 
code,    especially   for   those   less  familiar  with  LISP. 

5. All new function names defined by the developer have
their first letters capitalized, and multiple words
separated by periods, e.g. Find.Image.Points. Each
function performs one set of calculations and has a
meaningful name.

6. All functions previously defined by INTERLISP are
fully capitalized, e.g. PROG.

7. All user-defined variable names are in small
letters with different words separated by periods, e.g.
start.pt. Their names are also meaningful.

8. Suffixes at the end of variables are used to denote
their data types when the variable name itself is not
sufficient, e.g. counter.i is an integer, difference.r
is real, image.pixel.ch is a one-letter character,
extra.flag.b is a boolean flag (its value is either T
or NIL), and begin.pt.c is a cartesian coordinate (row
first, then column).

9.  An  error  message  returned  by  a  function  is  a  string 
variable,  so  that  the  calling  routine  can  quickly  check 
for  an  error  condition.  No  strings  are  returned  by  a 
function  that  successfully   completes   its  task. 

10.  No    self-modifying   code    or    stack  manipulations  are 
used. 

11.  Input/output    is   limited   to  a   few   functions. 

12.  All   software   is  well   commented. 
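Guidelines 8 and 9 can be illustrated with a minimal sketch. It is written in Python rather than INTERLISP, and the function body, the error text and the helper is_error are hypothetical; only the conventions themselves (type suffixes, row-first coordinates, a string return value meaning failure) come from the list above.

```python
from typing import List, Tuple, Union

Point = Tuple[int, int]  # cartesian coordinate: (row, column), row first


def find_image_points(image_rows: List[str]) -> Union[List[Point], str]:
    """Return the (row, column) of every non-blank pixel, or an error string."""
    if not image_rows:
        # Guideline 9: an error is reported as a string return value.
        return "ERROR: empty image"
    points_c: List[Point] = []  # suffix marks a list of cartesian coordinates
    for row_i, row in enumerate(image_rows):
        for col_i, pixel_ch in enumerate(row):
            if pixel_ch != " ":
                points_c.append((row_i, col_i))
    return points_c


def is_error(result: object) -> bool:
    """A successful call never returns a string, so a string means failure."""
    return isinstance(result, str)
```

The calling routine needs only one type check to detect a failure, which is the point of guideline 9.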

E.    ARTIFICIAL    INTELLIGENCE 

INTERLISP  is  an  excellent  language  to  use  for 
developing  AI  software.  The  application  of  AI  to  this 
pattern  recognition  system  was  a  modest  one.  The  area 
of  AI  of  interest  in  this  system  was  knowledge 
representation. The flavor style of object-oriented
programming used within it has several features that
make  it  useful  for  knowledge  representation.  These 
will  be  discussed  in  more  detail  in  PART  III,  Section 
F. 

AI  is  a  complex  and  controversial  field.  At  the 
present  time,  it  is  in  a  transition  phase  between 
theoretical  and  practical  applications.  An  AI  system 
is  using  intelligence  when  it  has  the  ability  to  focus 
on  the  most  relevant  knowledge  and  to  ignore  less 
relevant  knowledge  [30].  How  its  knowledge  base  is 
represented  will  play  an  important  part  in  the  overall 
AI  system's  ability  to  find  adequate  solutions  to  the 
problem  being  solved.  Knowledge  should  be  logically 
organized  and  easy  to   understand,    access  and   update. 

AI  is  built   on  the  following  three  ideas: 

1.  knowledge   of   the   domain   of   interest, 

2.  methods   for   operating  on  that  knowledge, 

3.  control    structures    for    choosing    the    appropriate 
methods   for   modifying  the   knowledge   base    [31]. 

The power of an AI system comes more from the knowledge
it has than from anything else.

Expert systems are one area of AI research that has
achieved  a  fair  amount  of  success  in  the  past  several 
years.  They  are  computer  programs  that  solve  difficult 
problems  requiring  expertise,  by  utilizing  facts  and 
heuristics  (rules  of  thumb)  that  the  human  expert  would 
consider.  Expert  systems  can  be  distinguished  from  the 
broad  class  of  AI  problems  by  the  fact  that  they  solve 
problems  within  a  restricted  domain  and  attain  expert 
levels  of  performance    [32]. 

Knowledge  plays  a  key  role  in  the  development  of  an 
expert  system.  Expertise  itself  consists  largely  of 
knowledge  about  specific  tasks  or  problem  areas.  The 
main  problem  for  the  expert  system  developer  is  how  to 
locate  and  store  knowledge,  so  that  the  computer  can 
use  it  efficiently  (knowledge  engineering).  There  are 
four  different  paradigms  (conceptual  tools)  used  in 
developing  expert  systems,  and  they  affect  how 
knowledge  will  be  represented.  The  first  paradigm  is 
the  procedural  or  function-oriented  approach.  LISP  is 
such  a  paradigm.  The  second  is  the  rule-oriented 
approach,  where  the  knowledge  base  consists  of  a  list 
of  IF-THEN  rules  used  to  arrive  at  a  conclusion.  The 
third is the data or access-oriented approach, where
the message is the key. The last paradigm is the
object-oriented  approach,  of  which  the  flavor  system 
developed  for  this  pattern  recognition  system  is  a 
member. 

F.    OBJECT-ORIENTED   PROGRAMMING 

Object-oriented  programming  is  less  a  new  way  of 
programming  than  a  new  way  of  perceiving  how  a  program 
is  to  be  organized.  Data  objects  and  the  functions 
that  operate  on  them  are  linked  together  in  a 
structure.  Instead  of  passing  data  to  a  function,  an 
object  is  requested  to  perform  (message  passing)  some 
operation    (function)    on    itself. 

Each  data  object  is  a  member  of  a  certain  class 
(flavor)  of  objects.  All  data  objects  within  the  same 
class  share  the  same  variables  and  procedures 
(methods).  After  creation,  each  data  object  is 
considered  an  instance  of  its  class.  Each  class  can 
have  multiple  instances,  but  each  instance  belongs  to 
only  one  class.  Calculations  are  accomplished  by 
sending  messages  to  an  instance  of  a  class  and  telling 
it what function (method) to execute.

A computer language that supports object-oriented
programming should have the following attributes:
information  hiding,  dynamic  binding  and  inheritance 
[33].  The  first  language  containing  these  elements  was 
Smalltalk,    developed  at  XEROX  Corp. 

Information  hiding  and  data  abstraction  are  used  so 
that  the  developer  need  only  focus  on  the  important 
aspects  of  the  software.  A  small  set  of  functions  is 
used  to  set  up  and  run  the  interdependent  modules  in 
the  object-oriented  program.  These  functions  are 
independent  of  the  type  of  object  classes  and  instances 
in  existence.  The  developer  does  not  need  to  know  how 
the  instance  variables  or  methods  are  changed  or 
invoked,  only  the  names  of  the  functions  that 
accomplish  these  actions.  The  variables  in  one 
instance  are  not  changed  by  methods  acting  on  other 
instances  within  the  same  class.  Not  only  is 
information  hidden,    but  it  is  also  protected. 

Dynamic  bindings  of  variables  are  required  so  that  a 
method  does  not  need  to  know  what  data  types  it  will  be 
acting  on,  before  it  is  actually  called.  This  is  an 
aspect of data abstraction. An example of this would be
a stack routine with methods to POP and PUSH. Ideally,
one does not want to have to specify different POP and
PUSH routines for integer, real or string stacks [34].
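The stack example can be sketched as follows. This is a Python illustration, not code from the flavor system; the class name Stack and its method names are assumptions chosen for the example. Because the element type is bound only when a method runs, one pair of routines serves integers, reals and strings alike.

```python
class Stack:
    """A single stack whose PUSH and POP work for any element type."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # No type is declared: the binding is resolved at execution time.
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from an empty stack")
        return self._items.pop()


mixed = Stack()
mixed.push(7)         # integer
mixed.push(2.5)       # real
mixed.push("edge")    # string
```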

Inheritance  enables  objects  to  be  subclasses  of 
other  classes.  A  subclass  has  knowledge  about 
(inherits)  all  the  variables  and  methods  of  the  class 
it  belongs  to.  In  this  flavor  system,  one  restriction 
added  was  that  a  subclass  cannot  change  the  value  of  a 
variable  from  a  higher-level  class.  This  allows  for  a 
modular  design  of  the  overall  software  system,  combined 
with  information  hiding.  Higher  levels  of  classes  will 
be  more  general,  while  class  specialization  will  be 
accomplished  at  lower  levels  within  the  overall 
structure. Inheritance can be used to reduce the amount
of code that must be duplicated between different objects.


One  major  disadvantage  of  object-oriented 
programming  is  the  access  time  it  adds  when  a  function 
is  invoked.  Message  passing  is  not  as  efficient  as 
calling a procedure. Variable access and storage are
also slower and more complicated. Another disadvantage
is that a beginner cannot simply sit down and start
programming in an object-oriented environment without
first learning the key functions that control the
environment. More thought must be given to the overall
structure of the software before programming can
actually begin. Despite these
problems,  the  benefits  of  object-oriented  programming 
in  the  development  of  large  software  systems 
considerably   outweigh   the  disadvantages. 

G.    FLAVOR  SOFTWARE 

The  object-oriented  programming  style  developed  for 
use  in  this  pattern  recognition  system  is  called  a 
flavor  system.  It  can  run  independently  of  the  pattern 
recognition  system,  and  be  used  for  general  purpose 
object-oriented  programming.  Flavor  style  programming 
was  introduced  by  Howard  Cannon  (one  of  the  founders  of 
Symbolics,  Inc.)  and  is  now  available  on  several 
different types of LISP machines [35].

This flavor system consists of eight different
functions: DEFFLAVOR, DEFMETHOD, MAKE.INSTANCE,
SEND.MESSAGE, GET.VALUE, PUT.VALUE, PRINT.FLAVORS and
PRINT.INSTANCE, all stored in file FLAVOR. DEFFLAVOR
(DEFine FLAVOR) is the first function required. It
needs three parameters: new flavor name (class), a list
of variables (and any default values) to be associated
with  this  flavor,  and  the  name  of  the  parent  flavor 
(for inheritance). For the system to work correctly,
the developer must create flavors from the top downward,
i.e. starting
with  the  most  general  flavor.  The  first  flavor  that 
must  be  created  in  a  system  is  VANILLA.  This  flavor 
contains  variables  and  methods  inherited  by  all  other 
flavors  in  the  system.  The  creation  of  the  VANILLA 
flavor  destroys  any  previously  existing  flavor 
environment,  so  the  developer  must  be  careful  when  it 
is used. DEFFLAVOR modifies the global variable
flavor.environment, which keeps track of existing
flavors and instances. The developer never has to
access flavor.environment directly. DEFFLAVOR also
creates a new global variable which contains the list
of variables and default values. This variable is used
by MAKE.INSTANCE to create an instance of the flavor.
DEFFLAVOR returns an error message if there is not at
least one new variable associated with the new flavor.

DEFMETHOD  (DEFine  METHOD)  is  the  second  function 
required.  All  flavors  must  have  at  least  one  unique 
method  associated  with  them.  DEFMETHOD  requires  four 
parameters: the flavor name for the method, the method
name, the demon, and the actual method. There are
three types of demons available: before, none and
after. Neither the before nor the after demon method is
required. If the before demon is present, it will
always  be  executed  before  the  main  method  (no  demon). 
If  the  after  demon  exists,  it  will  be  executed  after 
the  main  method.  This  provides  a  way  to  hide 
information.  In  this  pattern  recognition  system,  the 
before  demon  was  generally  used  to  find  the  required 
parameters  for  the  main  method  and  place  them  into  a 
properly  ordered  list.  The  after  demon  was  generally 
used  to  place  the  return  variables  from  the  main  method 
into  their  appropriate  slots  in  the  flavor  environment. 
Using  this  approach  simplifies  the  main  method. 
DEFMETHOD  uses  a  global  variable  which  contains  the 
list  of  methods  for  each  flavor.  Again,  the  developer 
never  has  to  access  this  variable  directly.  DEFMETHOD 
returns  an  error  message  if  the  input  flavor  name  has 
not  already   been  defined  by  DEFFLAVOR. 

Up  to  this  point  in  the  development,  nothing  has 
actually  been  created,  only  defined.  To  create  an 
instance of a flavor requires a call to MAKE.INSTANCE.
This function will work only if DEFFLAVOR and at least
one DEFMETHOD have been previously executed for the
given flavor. It will also work only if an instance of
the parent flavor has already been created. This means
that before any other instance is made, an instance of
VANILLA must be created. MAKE.INSTANCE requires the
following  variables:  flavor  name,  name  for  the  new 
instance,  name  of  the  parent  instance,  and  any  initial 
values  for  the  flavor  variables.  The  name  of  the 
parent  instance  is  required  so  that  inheritance  can  be 
strictly  controlled,  especially  when  there  are  multiple 
instances  of  the  parent.  Each  instance  of  a  new  flavor 
will  inherit  only  those  variable  values  belonging  to 
the  parent  instance.  Only  default  flavor  variables  are 
inherited  by  all  the  instances  of  a  flavor.  This 
allows  for  complete  data  isolation  between  instances  of 
a  flavor.  The  altering  of  one  flavor  variable  in  one 
instance  will  not  affect  that  variable  value  in  any 
other  instance  of  that  flavor.  The  structure  of  a 
flavor  is  contained  in  a  record  called  flavor  which  is 
at  the  end  of  the  file  FLAVOR.  A  flavor  record 
consists  of  a  list  of  methods,  a  list  of  variables  and 
the name of its parent flavor. MAKE.INSTANCE returns
an error if the parent instance is not in existence, if
there are no methods associated with the flavor or if
the flavor has not been defined yet.
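The ordering rules for DEFFLAVOR, DEFMETHOD and MAKE.INSTANCE can be summarized in a minimal Python sketch. It is not the INTERLISP source: the dictionary layout, the lower-case function names, the error texts and the check that a parent flavor is already defined are all assumptions made for illustration. Only the documented behavior is modeled (VANILLA resets the environment, each flavor needs a new variable and a method, and a parent instance must exist).

```python
# Hypothetical Python stand-ins for DEFFLAVOR, DEFMETHOD and MAKE.INSTANCE.
flavors = {}    # flavor name -> {"parent", "defaults", "methods"}
instances = {}  # (flavor name, instance name) -> {"parent", "values"}


def defflavor(name, defaults, parent=None):
    if not defaults:
        return "ERROR: a flavor needs at least one new variable"
    if name == "VANILLA":
        # Creating VANILLA destroys any existing flavor environment.
        flavors.clear()
        instances.clear()
    elif parent not in flavors:
        # Assumed check: flavors are created from the top downward.
        return "ERROR: parent flavor is not defined"
    flavors[name] = {"parent": parent, "defaults": dict(defaults), "methods": {}}
    return None


def defmethod(flavor, method_name, demon, function):
    if flavor not in flavors:
        return "ERROR: flavor is not defined"
    # demon is "before", "none" (the main method) or "after".
    flavors[flavor]["methods"][(method_name, demon)] = function
    return None


def make_instance(flavor, name, parent_instance=None, initial=None):
    if flavor not in flavors:
        return "ERROR: flavor is not defined"
    if not flavors[flavor]["methods"]:
        return "ERROR: flavor has no methods"
    parent_flavor = flavors[flavor]["parent"]
    if parent_flavor is not None and (parent_flavor, parent_instance) not in instances:
        return "ERROR: parent instance does not exist"
    instances[(flavor, name)] = {
        "parent": parent_instance,
        "values": dict(initial or {}),  # values private to this instance
    }
    return None
```

Following the programming style described earlier, each function returns a string only when it fails, so a caller can test the return value before continuing.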

After an instance has been created, the methods
associated with (or inherited by) the flavor can be
invoked.  Invoking  a  method  in  an  object-oriented 
system  involves  a  process  referred  to  as  message 
passing.  The  function  that  accomplishes  this  is 
SEND.MESSAGE.  This  function  requires  four  parameters: 
flavor  name,  instance  name,  name  of  the  method  to  be 
invoked  and  a  list  of  required  parameters  (if  any). 
First,  SEND.MESSAGE  tries  to  find  a  before  demon  for 
the  input  method  name.  If  one  is  found,  the  input 
variable  list  (if  any),  is  passed  into  the  before  demon 
and  it  is  executed.  The  return  list  from  the  before 
demon is placed into a variable called before.return.
Next, the main method is invoked and passed either
before.return, if the before demon exists, or the
original variable list. The return from the main
method is placed into the variable main.return.
Finally, SEND.MESSAGE searches for an after demon and,
if one exists, passes it main.return. Either
main.return or the return from the after demon is
returned when SEND.MESSAGE is completed. Various error
messages are returned if the flavor, instance or method
does not exist, or if main.return contains an error.
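The dispatch order described above can be sketched in Python. This is a simplified illustration under stated assumptions: the flavor and instance lookup is omitted, the methods are passed in as a dictionary keyed by (method name, demon), and the method Square.Area with its three parts is hypothetical.

```python
def send_message(methods, method_name, args=None):
    """Run the before demon, the main method and the after demon in order."""
    main = methods.get((method_name, "none"))
    if main is None:
        return "ERROR: method does not exist"
    before = methods.get((method_name, "before"))
    after = methods.get((method_name, "after"))

    main_input = args if args is not None else []
    if before is not None:
        before_return = before(main_input)  # gathers and orders parameters
        main_input = before_return
    main_return = main(main_input)
    if isinstance(main_return, str):        # a string signals an error
        return "ERROR in main method: " + main_return
    if after is not None:
        return after(main_return)           # stores or reshapes the results
    return main_return


# Hypothetical method with all three parts defined.
methods = {
    ("Square.Area", "before"): lambda args: [args[0], args[0]],
    ("Square.Area", "none"): lambda args: [args[0] * args[1]],
    ("Square.Area", "after"): lambda result: {"area.r": result[0]},
}
```

Keeping parameter gathering in the before demon and result storage in the after demon leaves the main method as a plain calculation, which is the simplification the text attributes to this approach.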

Two functions, GET.VALUE and PUT.VALUE, deal
exclusively with flavor variables. GET.VALUE returns a
flag and the value of the variable requested. It
requires three parameters: flavor name, instance name
and the variable name. If no value for the variable is
found on the specific instance variable list, then the
default variable list defined in DEFFLAVOR is checked.
Any variable with no value on either list is given a
value of NIL. NIL is the default for all variables
with no previously entered value. The flag returned by
GET.VALUE is set to T only if the flavor, instance and
variable names were legal. If the flag is NIL, then an
error occurred. PUT.VALUE places a new value for a
flavor variable into the list of one instance of that
flavor. It requires four parameters: flavor name,
instance name, variable name and new variable value.
By requiring the instance name, isolation between
instances of one flavor can be maintained. PUT.VALUE
returns an error message if the flavor, instance or
variable name was invalid. Any flavor can request the
value of a variable it owns or inherits (GET.VALUE),
but only the flavor owning the variable can change its
value (PUT.VALUE). This adds protection to the system.
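The lookup and protection rules for GET.VALUE and PUT.VALUE can be sketched as follows. The Python data layout, the sample flavors and variables, and the error texts are assumptions for illustration; Python's None stands in for NIL, and True/False for the T/NIL flag.

```python
# Hypothetical data layout; the thesis keeps this in flavor.environment.
flavor_defaults = {  # flavor -> (parent flavor, default variable list)
    "VANILLA": (None, {"system.name": "demo"}),
    "MATRIX": ("VANILLA", {"rows.i": None, "cols.i": 0}),
}
instances = {  # (flavor, instance) -> parent instance key and own values
    ("VANILLA", "VANILLA"): {"parent": None, "values": {}},
    ("MATRIX", "m1"): {"parent": ("VANILLA", "VANILLA"), "values": {}},
    ("MATRIX", "m2"): {"parent": ("VANILLA", "VANILLA"), "values": {}},
}


def owner_of(flavor, variable):
    """Walk up the parent chain to the flavor that declares the variable."""
    while flavor is not None:
        parent, defaults = flavor_defaults[flavor]
        if variable in defaults:
            return flavor
        flavor = parent
    return None


def get_value(flavor, instance, variable):
    """Return (flag, value); the flag is False when any name is invalid."""
    key = (flavor, instance)
    if key not in instances:
        return (False, None)
    owner = owner_of(flavor, variable)
    if owner is None:
        return (False, None)
    while key[0] != owner:              # inherited: use the parent instance
        key = instances[key]["parent"]
    values = instances[key]["values"]
    if variable in values:              # instance list is checked first
        return (True, values[variable])
    return (True, flavor_defaults[owner][1][variable])  # then the defaults


def put_value(flavor, instance, variable, value):
    """Store a value on one instance; only the owning flavor may do so."""
    if (flavor, instance) not in instances:
        return "ERROR: unknown flavor or instance"
    if variable not in flavor_defaults[flavor][1]:
        return "ERROR: variable is not owned by this flavor"
    instances[(flavor, instance)]["values"][variable] = value
    return None
```

Because every call names the instance, a value stored on m1 never appears on m2, which is the isolation between instances described above.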

PRINT.FLAVORS  is  used  to  output  the  present  flavor 
environment.  It  prints  out  a  list  of  flavors  that  have 
been  defined  and  the  names  of  all  instances  of  those 
flavors. PRINT.INSTANCE is used to print out all the
methods, variables and their values, both owned and
inherited,  belonging  to  one  instance  of  a  flavor.  The 
required  parameters  are   the   flavor  and  instance  names. 


H.       FLOW    OF   CONTROL 

The  basic  functions  that  make  up  the  pattern 
recognition  system  were  discussed  in  PART  II.  This 
section  will  detail  how  the  flavor  system  interacts 
with  the  pattern  recognition  functions  to  achieve  an 
object-oriented   pattern   recognition   system. 

The  top  function  for  the  entire  system  is  Skeleton, 
which  is  contained  in  file  START_SYSTEM.  This  is  the 
function  the  user  calls  to  start  the  entire  process. 
The  first  information  the  user  is  prompted  for  is 
whether  a  session  transcript  is  desired.  This  will 
allow  the  user  to  keep  a  printout  of  all  the  screen 
displays. After this, the function Load.System.Files
is called. This function loads all the compiled files
required by the system. The names of these files are
IN_OUTPUT.V, CENTER.V, TRACE.V, FLAVOR.V, MOMENT.V,
REG.V, CLASSIFY.V, MATRIX.V and SET_UP_SYSTEM.V. All
are contained in the VMS subdirectory <STOUT.COMPILED>.
The  next  function  called  is  Set.Up.Flavors.  This  is 
the  key  function  for  the  flavor  system.  In  it,  all  the 
flavors  and  methods  for  this  system  are  defined.  The 
flavor   system   used   in  this  program   is  outlined  below. 

Figure  7.      Organization  of   flavor   system. 


VANILLA 

GENERAL.INFORMATION
MATRIX     INTERMEDIATE.PROCESS     LIBRARY


IMAGE 


CLASSIFIER 


The flavor MATRIX contains information about the
individual arrays; IMAGE contains information about the
individual patterns in each array; LIBRARY contains the
classifier image library; INTERMEDIATE.PROCESS contains
the information about the autoregressive and Zernicke
moment algorithms; and CLASSIFIER contains the
classifier and clustering algorithms. A complete
description of these flavors is contained in the
comments in Set.Up.Flavors in file SET_UP_SYSTEM.

After this environment is defined, instances of
VANILLA, GENERAL.INFORMATION, INTERMEDIATE.PROCESS and
CLASSIFIER are created. In this system, the name of
the instance of VANILLA must be "VANILLA".
PRINT.FLAVORS  is  then  called  to  display  the  current 
flavor  environment.  Next,  a  message  is  sent  to  the 
method Get.User.Variables. This method contains the
user  prompts  for  all  the  key  system  parameters.  The 
information  requested  from  the  user  includes  the  type 
of  system  desired,  classifier  or  clustering,  the  type 
of  intermediate  process,  the  type  of  classifier,  the 
name  of  the  arrays  to  be  input,  and  the  name  of  the 
library  (if  any).  A  sample  session  is  contained  in 
APPENDIX    1. 

If  a  valid  list  of  arrays  was  loaded,  then  the 
analysis  begins.  If  the  user  is  going  to  classify 
geometric  objects,  then  an  instance  of  flavor  LIBRARY 
is  created.  If  a  valid  library  file  was  loaded  into 
the environment in Get.User.Variables, then the image
classification library is configured by a call to
method Create.Library.From.File. Next, a loop is
entered  that  continues  until  all  the  arrays  loaded  into 
the  environment  are  analyzed. 

For each new array, an instance of MATRIX is
created. After an image pixel is found (method
Find.Image), a new instance of IMAGE is created. This
image is analyzed and the appropriate method of
INTERMEDIATE.PROCESS is executed to obtain the feature
vector. The next steps depend on what the user
desires. If classification is desired, then the
appropriate method from CLASSIFIER is called to
classify the image. If clustering is desired, then all
the feature vectors are placed into a list for the
method Clustering. After the classification analysis,
the system outputs the results for the present image.
If the classifier is adaptive, the user is requested to
verify the classification of the image before the
library is updated. This cycle is continued until all
images on all the arrays are analyzed. At the
conclusion of the program, the user can store the
library in a file. This is especially useful for
adaptive libraries.
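The analysis cycle described above can be outlined in a short Python sketch. The function run_system and its three callable parameters are hypothetical stand-ins for the flavor methods; user prompts, library handling and the adaptive verification step are left out.

```python
def run_system(arrays, find_images, extract_features, classify=None):
    """Analyze every image on every array.

    find_images(array) yields the images on one array, extract_features(image)
    returns a feature vector, and classify(vector) returns a class label.
    When classify is None the vectors are collected for clustering instead.
    """
    labels = []
    cluster_input = []
    for array in arrays:                      # one MATRIX instance per array
        for image in find_images(array):      # one IMAGE instance per image
            vector = extract_features(image)  # INTERMEDIATE.PROCESS method
            if classify is not None:
                labels.append(classify(vector))   # CLASSIFIER method
            else:
                cluster_input.append(vector)      # saved for Clustering
    return labels if classify is not None else cluster_input
```

Passing the processing steps in as parameters mirrors the way the user's choices of intermediate process and classifier select which methods the loop invokes.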

IV.    System   Performance

A.      CPU   Function  Times 

To  assist  the  user  in  making  a  choice  of  the  type  of 
feature  extraction  technique  used  in  the  pattern 
recognition  system,  various  timing  functions  were  used 
and  analyzed.  These  results  are  especially  useful  when 
time is an important factor. CPU times for the
low-level functions allow the user to analyze their
relative complexities.

Several  measurements  of  the  pattern  recognition 
system's  performance  can  be  made.  This  section  will 
compare CPU times for the key methods (functions) in
the low- and intermediate-level image processing
areas. The first two figures (Figures 8 and 9) are for
methods  in  the  low-level   image  processing  section. 

Figure 8. Low-level Functions - CPU Times (I)

Figure 9. Low-level Functions - CPU Times (II)

Figure 8 contains CPU times for the methods
Follow.Edge, Shell.Sort.Edge and Find.Image.Points.
Figure 9 contains CPU times for the methods
Calculate.Area, Calculate.Center.Area, Col.Sums and
Replace.Image.Points. The X axis is the length of one
side of the square being analyzed by these low-level
methods. The area of the square increases from 9 to
169 pixels. The curve for Calculate.Area +
Calculate.Center.Area shows very little change as the
size of the square increases. These two methods only
multiply and divide several real numbers. The method
Col.Sums shows a nearly linear increase in time until
the last square. The method Follow.Edge also shows a
fairly steady rise, without the sudden jump at the end.
The other three methods, Shell.Sort.Edge,
Find.Image.Points and Replace.Image.Points, appear to
increase with the square of the length (in proportion
to the area). This is expected for Find.Image.Points
and Replace.Image.Points, because they consider all
image points. Shell.Sort.Edge sorts the edge points
only, but the algorithm is not a linear function of the
number of edge points. It is clear that
Find.Image.Points will increasingly dominate CPU time
as the size of the image grows.
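For readers unfamiliar with the sorting method behind the name Shell.Sort.Edge, a generic Shell sort is sketched below in Python. The gap sequence (repeated halving) and the sort key (row, then column) are assumptions; the text does not state which ones the system uses. With this gap sequence the worst-case running time grows as the square of the number of points, so the cost is not linear in the number of edge points.

```python
def shell_sort(points):
    """Sort a list of (row, column) edge points in place with Shell's method."""
    gap = len(points) // 2
    while gap > 0:
        # Insertion sort over elements that are `gap` positions apart.
        for i in range(gap, len(points)):
            current = points[i]
            j = i
            while j >= gap and points[j - gap] > current:
                points[j] = points[j - gap]
                j -= gap
            points[j] = current
        gap //= 2
    return points
```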

The next set of figures (Figures 10, 11, 12, 13, and
14) shows CPU times for various intermediate-level
image  processing  methods.  These  times  can  be  useful  to 
the  system  developer  when  time  is  a  key  consideration 
in  the  analysis  process.  Figure  10  compares  the  times 
for the two Zernicke moment feature extraction
techniques. The time for the Zernicke technique using
only the edge points increases linearly with the length
of the side, while the time for the one using all image
points increases in proportion to the square of the
length. When time is a critical factor and large images
are involved, the technique using all the image points
may have to be avoided, despite its greater accuracy.
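The difference between the two growth rates follows from counting pixels in a filled square, as the short Python sketch below shows. The function name pixel_counts is hypothetical, and the sketch assumes that the edge consists of the boundary pixels of the square.

```python
def pixel_counts(side):
    """Return (edge pixels, all pixels) for a filled square of the given side."""
    area = side * side
    # Every pixel of a 1 x 1 square is an edge pixel; otherwise the
    # boundary holds four sides of `side` pixels minus the shared corners.
    edge = 1 if side == 1 else 4 * side - 4
    return edge, area


# Sides 3 through 13 cover the areas 9 to 169 used in the timing figures.
table = [(side,) + pixel_counts(side) for side in range(3, 14)]
```

Over this range the number of edge points grows from 8 to 48 while the number of image points grows from 9 to 169, so a method that visits only edge points does roughly one quarter of the work at the largest size.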

Figures  11  through  14  are  CPU  times  for  various 
combinations  of  ray  and  lag  components  in  the 
autoregressive  model  feature  extraction  technique. 
Figure  11  shows  the  relationships  between  the  size  of  a 
square  and  three  different  ray  numbers,  for  one  lag 
component.  Figure  12  shows  the  same  relationships, 
except  using  two  lag  components.  From  these  two 
figures,  one  can  see  that  the  addition  of  rays  does  not 
significantly increase the CPU time for the
autoregressive  technique.  Figure  13  shows  the 
relationship  between  the  number  of  lag  components  and 
the  area  of  the  image.  For  32  ray  slopes,  increasing 
the  number  of  lag  components  does  not  significantly 
increase  the  CPU  time  for  the  method.  Figure  14 
compares  times  for  various  combinations  of  lag 
components and ray slope quantities. From this plot,
one can see that there is a significant increase, at
all lag values, when the number of ray slopes is
increased from 8 or 16 to 32. These results from the
autoregressive method leave the user with a lot of
discretion for parameter selection. Comparing the CPU
times with those from the Zernicke moment technique
shows that the autoregressive technique uses less time
than the long version (using all image pixels) of the
Zernicke method, but more time than the short version
(using edge points only).

Figure 10. Zernicke Moment Function - CPU Times

Figure 11. Autoregressive Model Function - CPU Times (I)

Figure 12. Autoregressive Model Function - CPU Times (II)

Figure 13. Autoregressive Model Function - CPU Times (III)

Figure 14. Autoregressive Model Function - CPU Times (IV)

B.    Invariance   Properties 

The  invariance  of  the  two  feature  extraction 
techniques  to  translational,  scale  and  rotational 
transformations  will  determine  their  utility  in  this 
pattern  recognition  system.  The  translational  and 
scale  invariances  will   be   investigated  together. 

Figures  15  through  20  apply  to  the  autoregressive 
feature  extraction  technique.  In  each  of  these 
figures,  the  maximum  variation  will  be  calculated. 
This  number  is  the  percentage  error  of  the  feature 
element  with  the  largest  change  on  the  plot.  Error  is 
the smallest value of that element divided by the
largest value. An error of zero means the feature
element  did  not  change.  Scale  invariance  was 
calculated  using  a  square  with  an  area  varying  from  9 
to  221.  Rotational  invariance  was  calculated  using  a 
square  with  an  area  equal  to  100,  rotated  through  45 
degrees,    in   5   degree   increments. 
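Taken literally, the ratio of the smallest value to the largest value equals one, not zero, for an element that does not change. The Python sketch below therefore assumes that the error is the relative spread, (largest - smallest) / largest, which is zero when nothing changes. This reading is an assumption and is not confirmed by the text; the function names and sample values are hypothetical.

```python
def variation(values):
    """Relative spread of one feature element (assumes positive values)."""
    largest = max(values)
    smallest = min(values)
    return (largest - smallest) / largest


def maximum_variation(elements):
    """Largest variation over all feature elements on a plot.

    `elements` maps an element name to its values across the scaled
    or rotated images.
    """
    return max(variation(values) for values in elements.values())


# Hypothetical readings for two curves on one plot.
plot = {"16 rays": [1.00, 0.95, 0.88], "32 rays": [1.00, 0.99, 0.97]}
```

Under this assumption the plot above has a maximum variation of about 12%, set by the 16 ray curve.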

Figures 15 and 16 contain the feature elements for
the model using one lag component. Only scale
invariance was considered for one lag component. The
maximum variation for the A/B^(1/2) term is 12%, for the
term calculated from 16 rays. Figure 16 contains the
lag component for the feature vector and has a maximum
variation of 1%, for the 16 ray term. For the simple
distance classifier method, the largest feature element
will have the most impact on the distance between two
image feature vectors. The lag component has a value
20 to 30 times larger than the A/B^(1/2) term, so the
scale invariance error is close to 1%.
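
A short numerical sketch shows why the larger element dominates. The feature values below are hypothetical and chosen only to mimic the ratios quoted above (a lag component about 25 times the other term, with changes of 1% and 12% respectively); the function name is likewise an assumption.

```python
def squared_distance_shares(u, v):
    """Fraction of the squared Euclidean distance contributed by each element."""
    squares = [(a - b) ** 2 for a, b in zip(u, v)]
    total = sum(squares)
    if total == 0:
        return [0.0 for _ in squares]
    return [s / total for s in squares]


# Hypothetical feature vectors: [A/B^(1/2) term, lag component]
reference = [1.00, 25.00]
scaled = [0.88, 24.75]  # 12% change in the first element, 1% in the second

shares = squared_distance_shares(reference, scaled)
# The lag component supplies most of the distance despite its smaller
# percentage change, because its absolute change is larger.
```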

Figures 17 through 20 contain feature elements using
two lag components. Both scale and rotational
invariances are considered in this example, and only 16
or 32 rays are used. The A/B^(1/2) term is plotted in
Figures 17 and 18. Its maximum variation is 19% for
the scale changes and 40% for the rotational changes.
Both occurred with the 32 ray example. Considering
both 16 and 32 rays, the maximum variation is about
twice as much for the rotated images as for the scaled
images. Figures 19 and 20 plot the larger of the two
lag components. The maximum variation for the scaled
images is 22%. The maximum variation for the rotated
images is 23%. Both occurred with the 32 ray example.
This lag component is the largest of the three elements
in the feature vector, so its variation will have the
most impact on the overall error.

Figure 15. Autoregressive Model Feature Element (I)

Figure 16. Autoregressive Model Feature Element (II)

Figure 17. Autoregressive Model Feature Element (III)

Figure 18. Autoregressive Model Feature Element (IV)

A  few  conclusions  can  be  drawn  from  these  examples. 
The  autoregressive  model  technique  offers  better  scale 
invariance  than  rotational  invariance,  for  the  set  of 
images  analyzed.  Larger  sized  images  will  exhibit 
better  rotational  invariance  than  smaller  ones.  There 
also  seems  to  be  no  real  advantage  to  using  32  rays 
instead  of  16,  for  images  of  the  size  analyzed  here. 
For  larger  images,  or  more  complicated  ones,  a  larger 
number  of  rays  will  be  an  advantage. 

Figure 19. Autoregressive Model Feature Element (V)

Figure 20. Autoregressive Model Feature Element (VI)

The next set of plots (Figures 21 through 27)
concerns the Zernicke moment feature extraction
technique.  Figures  21,  22  and  23  are  plots  of  feature 
elements  for  this  technique  using  all  image  points. 
Figures  21  and  22  are  plots  of  one  second  order  moment 
and  one  fourth  order  moment.  The  scaled  version  has  a 
maximum  variation  of  6%  for  Z40.  All  the  rest  of  the 
second  and  third  order  moments  for  the  scaled  images 
were  zero,  so  this  technique  exhibits  excellent  scale 
invariance.  The  rotated  squares  (Figure  22)  with  the 
same  two  moments  have  a  maximum  variation  of  4%  for 
Z40.  Figure  23  contains  the  plots  for  two  second  order 
and  one  third  order  moment  for  the  rotated  squares. 
These  same  three  moments  were  all  zero  for  the  scaled 
squares.  No  variation  measurements  are  possible 
because  of  the  zero  value  for  the  0  degree  case.  The 
values  vary  considerably,  but  they  are  all 
approximately  3  to  4  orders  of  magnitude  smaller  than 
the  moments  on  Figure  22.  From  these  results,  one  can 
see  that  this  technique  also  exhibits  good  rotational 
invariance. This technique has better invariance
characteristics than the autoregressive model
technique, for the type of images analyzed.

Figure 21. Zernicke Moment Feature Element (I)

Figure 22. Zernicke Moment Feature Element (II)

Figure 23. Zernicke Moment Feature Element (III)

The next set of plots (Figures 24 through 27)
concerns the Zernicke moment technique using just the
edge  points.  As  is  expected,  this  technique  is  not  as 
accurate as the one using all the image points. Figure
24 is a plot of the two largest moments, Z20 and Z40,
for the scaled squares. Its maximum variation is 7.5%
for  Z40.  Figure  25  is  a  plot  of  the  same  two  moments 
for  the  rotated  images.  Its  maximum  variation  is  much 
larger  than  indicated,  because  the  points  for  the  45 
degree  rotation  were  omitted.  The  45  degree  rotation 
caused  a  large  increase  in  the  size  of  Z40,  from  0.6  to 
9.4.  The  other  moment  has  a  maximum  variation  of  16%, 
including  45  degrees.  Figures  26  and  27  are  plots  of 
the  other  second  order  moment  and  the  two  largest  third 
order  moments.  Figure  26  depicts  the  scaled  squares 
and  has  a  maximum  variation  of  79%  for  Z22.  The 
rotated  squares  are  shown  in  Figure  27.  The  same 
problem  with  the  values  for  the  45  degree  rotation  also 
occurred  here,  for  all  three  moments.  Z31  went  from 
0.08  to  11.1.  This  figure  shows  that  the  rotational 
invariance  worsens  after  25  degrees,  for  these  three 
moments. 

Figure 24. Zernicke Moment Feature Elements (I-A)

Figure 25. Zernicke Moment Feature Elements (II-A)

Figure 26. Zernicke Moment Feature Elements (III-A)

Figure 27. Zernicke Moment Feature Elements (IV-A)

These results clearly show that the Zernicke moment
technique, using just the edge points, is the worst of
the three techniques, with respect to scale and
especially rotational invariance. The Zernicke moment
technique, using all the image points, is better than
any of the autoregressive model examples. As one would
expect, the CPU times for the three techniques increase
with their quality. For images less than
200 pixels in area, the Zernicke technique, using all
the image points, is the best one. For larger images,
a time versus quality tradeoff must be made.

C.   Clustering 

Clustering groups a set of images by some distance
measurement and organizes them in a tree-like
structure (dendrogram). The larger the distance at
which two images are joined, the more dissimilar they
are. The sample images
that  were  clustered  are  contained  in  the  array  depicted 
in  Figure  28.  The  numbers  underneath  the  X  axis  in 
Figures  29  and  30  correspond  to  the  numbers  of  the 
images   in   Figure   28. 
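
The idea of building a tree by repeatedly joining the closest groups can be sketched as follows. This is a minimal illustration, not the clustering algorithm of this system: the single-linkage rule (distance between the closest pair of members), the function names and the sample feature vectors are all assumptions made for the example.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def single_linkage(vectors, distance):
    """Agglomerative clustering; returns the merge history of a dendrogram.

    Each entry is (members_a, members_b, merge_distance), in merge order.
    Single linkage is an assumed rule, chosen for simplicity.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical two-element feature vectors for four images
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
history = single_linkage(features, squared_euclidean)
```

Reading the merge history from first to last entry corresponds to reading a dendrogram from the bottom up: images 0 and 1 join first, then images 2 and 3, and the two pairs join last at a much larger distance.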

Figure 28. Set of images to cluster.

Figures 29 and 30 are dendrograms built from the
output of the clustering algorithm. These examples
demonstrate the feature extraction techniques and the
simple distance classifier algorithm. Figure 29
contains the clusters obtained from the Zernicke moment
technique, using all the image points. The simple
distance classifier was used and the distance metric
was the Chebychev one. The Chebychev metric sets the
distance between two images to the largest absolute
difference between corresponding feature vector
elements. Figure 30 contains clusters obtained from the
autoregressive technique, using 2 lag components and 32
rays. The same classifier was used, but the distance
metric was the Euclidean distance squared. The
differences in the two dendrograms highlight the fact
that each feature extraction technique and each
distance metric brings out different similarities and
dissimilarities within a set of images. No general
guidelines are available to assist the user in making
the "proper" choice of methods.
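
The two metrics named above are easy to state in code, and a small example shows how they can disagree about which image is closer. The vectors are hypothetical and the function names are chosen for this sketch only.

```python
def chebychev(u, v):
    """Largest absolute difference between corresponding elements."""
    return max(abs(a - b) for a, b in zip(u, v))


def euclidean_squared(u, v):
    """Sum of squared differences between corresponding elements."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical feature vectors for three images
a = [1.0, 4.0, 2.0]
b = [2.0, 1.0, 2.0]  # one large difference from a
c = [3.0, 6.0, 4.0]  # several moderate differences from a

# Chebychev:          a-b = 3.0,  a-c = 2.0   (c is closer to a)
# Euclidean squared:  a-b = 10.0, a-c = 12.0  (b is closer to a)
```

The Chebychev metric reacts only to the single worst element, while the squared Euclidean metric accumulates every difference, so the same set of feature vectors can produce different groupings.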

Figure 29. Dendrogram - Zernicke

Figure 30. Dendrogram - Autoregressive

D.    Example Runs

Two different sets of images were used to
thoroughly test this pattern recognition system. The
first test involved classifying three different types
of geometric shapes: rectangle, triangle and arrow.
The arrow was chosen because it is a combination of a
rectangle and triangle. The area of the images varied
from 26 pixels to 180. Rotations from 0 degrees to 45
degrees were also included. Many different
combinations of techniques were run and the results are
listed in Table 1. A total of 15 images were used.
The first three images were used to set up the
classifier, so only 12 decisions were made by
the system. Three categories of answers are used: the
answer is correct, the answer is wrong, or the answer is
wrong but the second closest choice is correct. For
the adaptive classifier, only second order moments were
possible because of real number precision restrictions
with INTERLISP on the VAX. The distance metric used
for all the simple classifier examples is the Euclidean
distance squared.
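
The scoring scheme for the simple distance classifier can be sketched as below. The sketch assumes one stored feature vector per class (as when the first three images set up the classifier) and the squared Euclidean metric; the class vectors, function names and the labels "right", "wrong" and "2nd" are hypothetical stand-ins for the three answer categories.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def rank_classes(unknown, class_vectors):
    """Class names ordered from the closest class to the farthest."""
    return sorted(class_vectors,
                  key=lambda name: squared_euclidean(unknown, class_vectors[name]))


def score(unknown, true_class, class_vectors):
    """Classify one image and report which answer category it falls in."""
    ranking = rank_classes(unknown, class_vectors)
    if ranking[0] == true_class:
        return "right"
    if ranking[1] == true_class:
        return "2nd"  # wrong, but the second closest choice is correct
    return "wrong"


# Hypothetical two-element feature vectors, one per class
classes = {
    "rectangle": [1.0, 0.2],
    "triangle": [0.4, 0.9],
    "arrow": [0.7, 0.6],
}
```

For example, `score([0.8, 0.5], "rectangle", classes)` falls in the "2nd" category, because the arrow vector is closest and the rectangle vector is the next closest.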
Table 1. Results from Test 1.

Technique                        Classifier   Right  Wrong  2nd

Zernicke (all, 2 moments)        Adaptive
Zernicke (edge, 2 moments)       Adaptive
Autoreg. (2 lag, 32 rays)        Adaptive
Autoreg. (2 lag, 16 rays)        Adaptive
Zernicke (all, 2 moments)        Non-adapt.
Zernicke (all, 6 moments)        Non-adapt.
Zernicke (all, 11 moments)       Non-adapt.
Zernicke (edge, 2 moments)       Non-adapt.
Zernicke (edge, 6 moments)       Non-adapt.
Zernicke (edge, 11 moments)      Non-adapt.
Autoreg. (2 lag, 32 rays)        Non-adapt.
Autoreg. (2 lag, 16 rays)        Non-adapt.
The  most  important  conclusion  from  Test  1  is  that 
the  adaptive  classifier,  with  its  varying  feature 
weights,  is  less  effective  than  the  simple  distance 
classifier.  To  analyze  these  results  in  more  detail, 
the  four  measurements  of  classifier  quality  were 
calculated  after  every  three  images  were  input.  This 
allows one to check class separations for the adaptive
classifier. Table 2 shows the value of quality measure
Q4 for the adaptive methods listed in Table 1, after
the indicated number of images has been input.

Table 2. Q4 values for Adaptive Classifiers

                                 Number of images input
Technique                        9        12       15

Zernicke (all, 2 moments)        33.7     29.65    10.418
Zernicke (edge, 2 moments)       11.5     1.31     l.H
Autoreg. (2 lag, 32 rays)        147.5    168.2    7.65
Autoreg. (2 lag, 16 rays)        13.0     16.7     10.7

From these results, one can see that part of the
reason that the adaptive classifiers are doing poorly
is that the between-class separations are decreasing,
and the within-class separations are growing. This type
of classifier does not work well for sets of images with
overlapping partitions. No improvement is obtained as
the number of images increases.
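
The effect of shrinking between-class separation and growing within-class separation can be illustrated with a generic ratio. The sketch below does not reproduce Q4; it uses an assumed measure (mean squared distance between class centers divided by mean squared distance of members from their own center), and the function names and sample classes are hypothetical.

```python
from itertools import combinations


def squared_euclidean(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vector(vectors):
    """Element-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def separation_ratio(classes):
    """Between-class over within-class separation (an assumed measure).

    Large values mean tight, well separated classes; values near or
    below one mean the classes overlap. Needs at least two classes.
    """
    centers = {name: mean_vector(members) for name, members in classes.items()}
    within_terms = [squared_euclidean(member, centers[name])
                    for name, members in classes.items() for member in members]
    between_terms = [squared_euclidean(centers[a], centers[b])
                     for a, b in combinations(centers, 2)]
    within = sum(within_terms) / len(within_terms)
    between = sum(between_terms) / len(between_terms)
    return float("inf") if within == 0 else between / within


# Hypothetical image classes in a two-element feature space
separated = {"a": [[0.0, 0.0], [0.0, 2.0]], "b": [[10.0, 0.0], [10.0, 2.0]]}
overlapping = {"a": [[0.0, 0.0], [0.0, 10.0]], "b": [[1.0, 0.0], [1.0, 10.0]]}
```

A ratio that falls as more images are added to the library signals the same problem described above: the classes are spreading out faster than they are moving apart.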

Another  important  result  of  this  test  is  that  the 
Zernicke  moment  technique,  using  all  the  image  points, 
provides the best feature vector for classification
purposes. This was expected, after the analysis of the
scaled and rotated squares. Also, the autoregressive
technique is shown to be more effective than the
Zernicke technique, using just the edge points. Not
much  improvement  was  found  by  increasing  the  number  of 
rays  from  16  to  32.  The  last  conclusion  that  can  be 
made  about  this  test  is  that  the  amount  of  information 
contained  in  the  third  and  fourth  order  moments  for  the 
Zernicke  technique  is  not  great.  For  the  non-adaptive 
classifier,  reducing  the  number  of  moments  did  not 
affect  the  classification  success  much. 

The  second  test  of  this  pattern  recognition  system 
used  a  set  of  block  capital  letters,  "A",  "E",  and  "I". 
The  area  of  the  letters  was  not  changed  much  but  their 
height  and  width  were.  Edges  were  also  smeared,  as  if 
noise  had  degraded  the  original  letters.  Some  of  the 
letters  were  also  placed  on  their  sides.  Results  from 
this  test  were  similar  to  the  above  test,  and  are 
contained  in  Table  3.  No  record  was  kept  of  answers 
that  were  wrong  but  had  the  second  guess  correct.  A 
total  of  15  images  were  used,  with  the  first  three 
forming  the  classifier. 
Table 3. Results from Test 2.

Technique                        Classifier   Right  Wrong

Zernicke (all, 11 moments)       Adaptive       7      6
Zernicke (edge, 11 moments)      Adaptive       6      7
Zernicke (all, 2 moments)        Adaptive       8      5
Zernicke (edge, 11 moments)      Adaptive       6      7
Autoreg. (2 lag, 32 rays)        Adaptive       5      8
Autoreg. (2 lag, 16 rays)        Adaptive       8      5
Zernicke (all, 2 moments)        Non-adapt.    10      3
Zernicke (all, 11 moments)       Non-adapt.    11      2
Zernicke (edge, 2 moments)       Non-adapt.     8      5
Zernicke (edge, 11 moments)      Non-adapt.     7      6
Autoreg. (2 lag, 32 rays)        Non-adapt.     6      7
Autoreg. (2 lag, 16 rays)        Non-adapt.     3     10


The  results  are  similar  to  those  from  Test  1,  but 
the  adaptive  classifier  performs  even  worse  than 
before,  for  the  Zernicke  technique.  The  adaptive 
classifier  is  helpful  in  the  autoregressive  technique. 
Once  again,  the  Zernicke  technique,  using  all  the  image 
points,  provided  the  best  feature  vector  for 
classification purposes. The higher order moments
contributed little or nothing to the classification
process. Classification actually got worse with the
higher order moments. This is plausible because higher
order moments are likely to be more sensitive to noise.
The  classifier  quality  measure,  Q4,  was  calculated  in 
the  middle  and  the  end  of  this  test.  In  all  adaptive 
cases,  its  value  decreased.  This  is  an  indicator  that 
the  image  classes  were  overlapping  again.  In  this  test 
the  Zernicke  technique,  using  just  the  edge  points,  was 
more  successful  than  the  autoregressive  technique.  No 
clear  advantage  between  16  and  32  rays  can  be  noted, 
since  16  rays  was  better  for  the  adaptive  classifier 
and  32   rays  was  better  for   the  non-adaptive  classifier. 

E.    Summary 

A  comprehensive  pattern  recognition  system  was 
developed  in  software,  using  INTERLISP,  that  allows  the 
user  to  input,  store,  analyze,  extract  features,  and 
classify  any  2-dimensional,  binary  geometric  object. 
The  user  can  choose  between  two  different  statistical 
feature  extraction  techniques  (autoregressive  model  or 
Zernicke  moments),  and  two  different  types  of 
classifiers  (simple  or  weighted).  Several  different 
distance metrics are available for the simple distance
classifier. Libraries of images, containing the image
classes used in the classification process, can be
input from a file, built dynamically (adaptively), and
stored back into a file. Measurements of the quality
of  the  library  image  classes  can  be  calculated,  and  are 
especially  useful  when  an  adaptive  classifier  is  used. 
If  no  library  of  images  is  available  or  desirable,  the 
user  can  analyze  the  similarities  and  dissimilarities 
between  a  set  of  input  images  using  a  technique  called 
clustering.  The  user  chooses  among  these  options  in 
response   to  a   series   of    system   prompts. 

After  running  a  series  of  images  through  the  pattern 
recognition  system,  the  following  conclusions  were 
made.  The  Zernicke  moment  technique,  using  all  the 
image  points,  provided  the  best  feature  vector  for 
classification purposes. It was the most invariant to
translational, scale and rotational transformations.
No  clear  second  best  technique  was  found.  CPU  time 
will  be  an  important  factor  for  large  images  (greater 
than  500  pixels),  especially  for  the  Zernicke  technique 
using  all    the   image   points. 

The weighted classifier, although more sophisticated
than the simple distance classifier, did not prove to
be the more accurate of the two. When there is a
possibility  of  having  blurred  or  noisy  images  in  each 
image  class,  the  simple,  non-adaptive  distance 
classifier  may  prove  to  be  a  more  accurate  classifier 
than   the  weighted   one. 

The  object-oriented  flavor  system  developed 
independently  of  the  pattern  recognition  system 
provided  an  excellent  software  development  environment. 
A set of eight INTERLISP functions makes up this flavor
system. These eight functions can be used
independently  of  the  pattern  recognition  software, 
allowing  the  user  to  develop  other  systems  utilizing 
the  features  of  object-oriented  programming.  Object- 
oriented  programming  provides  the  user  with  data 
abstraction,  information  hiding  and  protection,  late 
variable  bindings  and  inheritance.  In  software 
systems,  such  as  this  pattern  recognition  one,  object- 
oriented  programming  provides  an  excellent  way  to 
represent,  manipulate  and  display  system  knowledge 
(knowledge  representation).  This  type  of  environment, 
because  of  its  modularity,  enables  the  developer  to 
easily modify and/or add new functions (methods) to the
system, without having to change a lot of code. The
overall  structure  of  the  system  is  more  visible  to  the 
user. 

Future  enhancements  for  this  system  could  include 
more  feature  extraction  techniques  and  more 
sophisticated  classifiers.  A  more  sophisticated 
classifier is needed for overlapping image regions. A
good, high-level image processing section would assist
the classifier in making choices between two close
image classes. Syntactic techniques would allow this
pattern  recognition  system  to  identify  lines,  edges  and 
corners.  More  sophisticated  techniques  would  allow  the 
system  to  analyze  the  relationship  between  objects. 
These  techniques  will  be  necessary  if  3-dimensional 
image  processing  is  desired.  Adding  an  interface  to 
a  camera  would  allow  the  system  to  be  used  directly  in 
image  processing.  A  more  distant,  future  enhancement 
would  take  advantage  of  a  multi-processor  programming 
environment  to  calculate  the  Zernicke  moments  more 
rapidly.  The  algorithm  used  in  the  function  is  ideal 
for  division  among  several  processors,  and  would  make  a 
real-time system possible.

Appendix I. Sample Session

NOTE:  This  program  does  not  run  well  on  the  graphics 
terminals  because  of  their  continuous  scrolling.  For 
better  results,  use  the  other  terminals  and  press 
CONTROL  NO-SCROLL.  This  eliminates  continuous 
scrolling for these terminals.

A computer vision system consists of five
components:  data  acquisition,  low,  intermediate  and 
high-level  image  processing,  and  classification.  The 
pattern  recognition  system  described  here  is  concerned 
with  the  software  implementation  of  the  low, 
intermediate-level  and  classification  processes.  The 
system  was  developed  using  INTERLISP  and  allows  the 
user  to  input,  store,  analyze,  extract  features,  and 
classify  any  2-dimensional,    binary  geometric  object. 

The  low-level  processing  contains  the  algorithms 
that  trace  the  edge  of  an  image  and  calculate  its 
center  of  area.  The  intermediate-level  processing 
contains  two  statistical  feature  extraction 
techniques:  the  autoregressive  model  and  the  Zernicke 
moments.  The  feature  vectors  produced  by  these 
techniques  are  relatively  insensitive  to  translational, 
scale and rotational transformations. The Zernicke
technique,  using  all  image  points,  obtained  feature 
elements  that  were  the  most  invariant  to  these 
transformations. 

The classifier contains two distance measuring
algorithms that identify unknown objects by their
proximity  to  known  image  classes.  Class  membership  is 
based  on  the  smallest  distance  between  the  image  and  a 
class.  The  simple  classifier  uses  Euclidean  distance 
measurements  to  obtain  a  value  for  the  distance  between 
an   unknown   image   and  an  image   class. 

A  more  complicated  classification  technique 
transforms  the  feature  space  in  such  a  way  that  the 
distance between images within a class is reduced. An
adaptive classifier using this technique proved to be
less  accurate  than  the  simple  distance  classifier, 
especially   for   blurred  or  noisy   images. 

When  no  image  classes  are  available  or  desired,  the 
user  can  cluster  a  set  of  unknown  images.  This 
technique  creates  a  tree-like  structure  in  which 
objects  are  clustered  according  to  their  degree  of 
similarity. 

The  pattern  recognition  system  is  placed  into  an 
object-oriented  software  shell.  This  shell  (flavor 
system)  is  an  independent,  general-purpose  software 
development tool that the user can use to build and
execute complicated programs. It consists of eight
INTERLISP  functions  that  provide  all  the  advantages  of 
object-oriented  programming.  These  advantages  are 
information  hiding,  data  abstraction,  dynamic  binding 
and  inheritance.  This  flavor  system  provides  an 
excellent  way  to  represent,  manipulate  and  display 
system  knowledge. 


